perm filename V2L.TEX[TEX,DEK]1 blob
sn#381028 filedate 1978-09-20 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00022 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00004 00002 %folio 518 galley 1 (C) Addison-Wesley 1978 -
C00016 00003 %folio 520 galley 2 (C) Addison-Wesley 1978 -
C00027 00004 %folio 522 galley 3 (C) Addison-Wesley 1978 -
C00038 00005 %folio 524 galley 4 (C) Addison-Wesley 1978 -
C00045 00006 %folio 526 galley 5 (C) Addison-Wesley 1978 -
C00055 00007 %folio 528 galley 6 Beginning lost. (C) Addison-Wesley 1978 -
C00063 00008 %folio 530 galley 7 (C) Addison-Wesley 1978 -
C00079 00009 %folio 535 galley 8 (C) Addison-Wesley 1978 -
C00100 00010 %folio 539 galley 9 Almost total loss. (C) Addison-Wesley 1978 -
C00104 00011 %folio 541 galley 10 (C) Addison-Wesley 1978 -
C00117 00012 %folio 543 galley 11 (C) Addison-Wesley 1978 -
C00129 00013 %folio 545 galley 12 (C) Addison-Wesley 1978 -
C00144 00014 %folio 550 galley 13 (C) Addison-Wesley 1978 -
C00164 00015 %folio 558 galley 14 (C) Addison-Wesley 1978 -
C00182 00016 %folio 568 galley 15 (C) Addison-Wesley 1978 -
C00193 00017 %folio 570 galley 16 (C) Addison-Wesley 1978
C00208 00018 %folio 574 galley 17 WARNING: Much of this tape unreadable! (C) Addison-Wesley 1978
C00218 00019 %folio 579 galley 18 (C) Addison-Wesley 1978
C00231 00020 %folio 584 galley 19 (C) Addison-Wesley 1978
C00243 00021 %folio 587 galley 20 (C) Addison-Wesley 1978
C00254 00022
C00255 ENDMK
C⊗;
%folio 518 galley 1 (C) Addison-Wesley 1978 -
|newtype 58320---Computer Programming\quad (A.-W./Knuth)\quad
ch. 4\quad f. 518\quad g. 1
|qleft \12skip {\:r EXERCISES
\exno 1. [10] If we are
doing polynomial arithmetic modulo 10, what is $7x + 2$ minus
$x↑2 + 3?$ What is $6x↑2 + x + 3$ times $5x↑2 + 2?$
\exno 2. [17] True or false:\xskip (a) The product of monic polynomials
is monic.\xskip (b) The product of polynomials of respective degrees
$m$ and $n$ has degree $m + n$.\xskip (c) The sum of two polynomials
of degree $n$ is of degree $n$.
\exno 3. [M20] If each of the coefficients $u↓r$, $\ldotss$, $u↓0$,
$v↓s$, $\ldotss$, $v↓0$ in (4) is an integer satisfying the conditions
$|u↓i| ≤ m↓1$, $|v↓j| ≤ m↓2$, what is the maximum absolute value
of the product coefficients $w↓k$?
\exno 4. [21] Can the multiplication of polynomials modulo 2
be facilitated by using the ordinary arithmetic operations on
a binary computer, if coefficients are packed into computer
words?
\exno 5. [M21] Show how to multiply
two polynomials of degree $≤n$, modulo 2, with an execution
time of $O(n↑{\lg 3})$ when $n$ is large, by adapting
the Karatsuba--Ofman method described in Section 4.3.3.
|qleft \18skip |newtype {\:r 4.6.1. Division of Polynomials
|qleft }\6skip It is possible to divide one polynomial by another
in essentially the same way that we divide one multiple-precision
integer by another, when arithmetic is being done on polynomials
over a ``field.'' A field $S$ is a commutative ring with identity,
in which exact division is possible as well as the operations
of addition, subtraction, and multiplication; this means, as
usual that whenever $u$ and $v$ are elements of $S$ and $v ≠
0$, there is an element $w$ in $S$ such that $u = vw$. The most
important fields of coefficients that arise in applications
are
$$|indent a) the rational numbers (represented as fractions,
see Section 4.5.1);
|qleft b) the real or complex numbers (represented within a
computer by means of floating-point approximations; see Section
4.2);
|qleft c) the integers modulo $p$ where $p$ is prime (a division
algorithm suitable for large values of $p$ appears in exercise
4.5.2--15);
|qleft d) ``rational functions'' over a field (namely, quotients
of two polynomials whose coefficients are in that field, the
denominator being monic).
|qleft \6skip |cancelindent Of special importance is the field
of integers modulo 2, when the two values 0 and 1 are the only
elements of the field. Polynomials over this field (namely polynomials
modulo 2) have many analogies to integers expressed in binary
notation; and rational functions over this field have striking
analogies to rational numbers whose numerator and denominator
are represented in binary notation.
|qleft \quad Given two polynomials $u(x)$ and $v(x)$ over a
field, with $v(x) ≠ 0$, we can divide $u(x)$ by $v(x)$ to obtain
a quotient polynomial $q(x)$ and a remainder polynomial $r(x)$
satisfying the conditions
$$u(x) = q(x) \cdot v(x) + r(x),\qquad \hjust{deg}(r) < \hjust{deg}(v).\eqno (1)$$
[It is easy to see that there is at most one pair of polynomials
$\biglp q(x), r(x)\bigrp$ satisfying these relations;
for if (1) holds for both $\biglp q↓1(x), r↓1(x)\bigrp$
and $\biglp q↓2(x), r↓2(x)\bigrp$ and the same
polynomials $u(x)$, $v(x)$, then $q↓1(x)v(x) + r↓1(x) = q↓2(x)v(x)
+ r↓2(x)$, so $\biglp q↓1(x) - q↓2(x)\bigrp v(x) = r↓2(x) -
r↓1(x)$. Now if $q↓1(x) - q↓2(x)$ is nonzero, then deg$\biglp
(q↓1 - q↓2) \cdot v\bigrp =$ deg$(q↓1 - q↓2) +$
deg$(v) ≥$ deg$(v) >$ deg$(r↓2 - r↓1)$, a contradiction; hence
$q↓1(x) - q↓2(x) = 0$ and $r↓1(x) = r↓2(x)$.]
The following algorithm, which is essentially the
same as Algorithm 4.3.1D for multiple-precision division but
without any concern for carries, may be used to determine $q(x)$
and $r(x):$
\algbegin Algorithm D. (Division of polynomials over
a field). Given polynomials
$$u(x) = u↓mx↑m + \cdots + u↓1x + u↓0,\qquad v(x) = v↓nx↑n + \cdots + v↓1x + v↓0$$
over a field $S$, where $v↓n ≠ 0$ and $m ≥ n ≥
0$, this algorithm finds the polynomials
$$q(x) = q↓{m-n}x↑{m-n} + \cdots + q↓0,\qquad r(x) = r↓{n-1}x↑{n-1} + \cdots + r↓0$$
over $S$ that satisfy (1).
\algstep D1. [Iterate on $k$.]
Do step D2 for $k = m - n, m - n - 1, \ldotss$, 0; then the
algorithm terminates with $(r↓{n-1}, \ldotss , r↓0) ← (u↓{n-1},
\ldotss , u↓0)$.
\algstep D2. [Division loop.] Set $q↓k ← u↓{n+k}/v↓n$,
and then set $u↓j ← u↓j - q↓kv↓{j-k}$ for $j = n + k - 1, n
+ k - 2, \ldotss$, $k$.\xskip (The latter operation amounts to replacing
$u(x)$ by $u(x) - q↓kx↑kv(x)$, a polynomial of degree $< n + k$.)
|qleft \12skip |cancelindent \quad An example of Algorithm D
appears below in (5). The number of arithmetic operations is
essentially proportional to $n(m - n + 1)$. For some reason
this procedure has become known as ``synthetic division'' of
polynomials. Note that explicit division of coefficients is
only done by $v↓n$; if $v(x)$ is a monic polynomial (with $v↓n
= 1$), there is no actual division at all. If multiplication
is easier to perform than division it will be preferable to
compute 1/$v↓n$ at the beginning of the algorithm and to multiply
by this quantity in step D2.
We shall occasionally write $u(x)$ mod $v(x)$ for
the remainder $r(x)$ in (1).
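The division loop is short enough to state as a program. The following sketch (present-day Python, not part of the text) carries out Algorithm D over the field of integers modulo a prime $p$, representing a polynomial as a list of coefficients with the constant term first; the function name and representation are illustrative assumptions.

```python
def poly_divmod_mod_p(u, v, p):
    """Algorithm D: divide u(x) by v(x) over the integers modulo a prime p.

    u and v are coefficient lists, constant term first; returns (q, r)
    with u(x) = q(x)v(x) + r(x) and deg(r) < deg(v), all mod p."""
    u = list(u)                          # work on a copy of the dividend
    m, n = len(u) - 1, len(v) - 1        # m = deg(u), n = deg(v)
    vn_inv = pow(v[n], p - 2, p)         # 1/v_n by Fermat's little theorem
    q = [0] * (m - n + 1)
    for k in range(m - n, -1, -1):       # step D1: k = m-n, ..., 0
        q[k] = u[n + k] * vn_inv % p     # step D2: q_k <- u_{n+k} / v_n
        for j in range(n + k - 1, k - 1, -1):
            u[j] = (u[j] - q[k] * v[j - k]) % p
    return q, u[:n]                      # remainder is (r_{n-1}, ..., r_0)
```

Dividing $x↑8 + x↑6 + 10x↑4 + 10x↑3 + 8x↑2 + 2x + 8$ by $3x↑6 + 5x↑4 + 9x↑2 + 4x + 8$ modulo 13, as in the worked example later in this section, yields quotient $9x↑2 + 7$ and remainder $11x↑4 + 3x↑2 + 4$.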
\subsectionbegin{Unique factorization domains} If
we restrict consideration to polynomials over a field, we are
not dealing directly with many important cases, such as polynomials
over the integers, or polynomials in several variables. Let
us therefore now consider the more general situation that the
algebraic system $S$ of coefficients is a {\sl unique factorization domain},
not necessarily a field. This means that $S$ is a commutative
ring with identity, and that
$$|indent i) $uv ≠ 0$, whenever $u$ and $v$ are nonzero elements
of $S$;
|qleft ii) every nonzero element $u$ of $S$ is either a ``unit''
or has a ``unique'' representation of the form
$$u = p↓1 \ldotsm p↓t,\qquad t ≥ 1,\eqno (2)$$
where $p↓1, \ldotss$, $p↓t$ are ``primes.''
|qleft Here a ``unit'' $u$ is an element such that $uv = 1$
for some $v$ in $S$; and a ``prime'' $p$ is a nonunit element
such that the equation $p = qr$ can be true only if either $q$
or $r$ is a unit. The representation (2) is to be unique in
the sense that if $p↓1 \ldotsm p↓t = q↓1 \ldotsm q↓s$, where all
the $p$'s and $q$'s are primes, then $s = t$ and there is a
permutation $π(1) \ldotsm π(t)$ of $\{1, \ldotss, t\}$ such that
$p↓1 = a↓1q↓{π(1)}, \ldotss$, $p↓t = a↓tq↓{π(t)}$ for some units
$a↓1, \ldotss$, $a↓t$. In other words, factorization into primes is unique,
except for unit multiples and except for the order of the factors.
Any field is a unique factorization domain, in
which each nonzero element is a unit and there are no primes.
The integers form a unique factorization domain in which the
units are $+1$ and $-1$, and the primes are $\pm2$, $\pm3$, $\pm5$,
$\pm7$, etc. The case that $S$ is the set of all integers is
of principal importance, because it is often preferable to work
with integer coefficients instead of arbitrary rational coefficients.
%folio 520 galley 2 (C) Addison-Wesley 1978 -
\def\\#1({\mathop{\hjust{#1}}(}
It is an important fact (see exercise 10) that
{\sl the polynomials over a unique factorization domain form
a unique factorization domain.} A polynomial that is ``prime''
in this domain is usually called an {\sl irreducible polynomial.}
By using the unique factorization theorem repeatedly, we can
prove that multivariate polynomials over the integers, or over
any field, in any number of variables, can be uniquely factored
into irreducible polynomials. For example, the multivariate
polynomial $90x↑3 - 120x↑2y + 18x↑2yz - 24xy↑2z$ over the integers
is the product of five irreducible polynomials $2 \cdot 3 \cdot
x \cdot (3x - 4y) \cdot (5x + yz)$. The same polynomial, as
a polynomial over the rationals, is the product of three irreducible
polynomials $(6x) \cdot (3x - 4y) \cdot (5x + yz)$; this factorization
can also be written $x \cdot (90x - 120y) \cdot (x + {1\over
5}yz)$ and in infinitely many other ways, although the factorization
is essentially unique.
As usual, we say that $u(x)$ is a multiple of $v(x)$,
and $v(x)$ is a divisor of $u(x)$, if $u(x) = v(x)q(x)$ for
some polynomial $q(x)$. If we have an algorithm to tell whether
or not $u$ is a multiple of $v$ for arbitrary elements $u$ and
$v ≠ 0$ of a unique factorization domain $S$, and to determine
$w$ if $u = v \cdot w$, then Algorithm D gives us a method to
tell whether or not $u(x)$ is a multiple of $v(x)$, for arbitrary polynomials $u(x)$
and $v(x) ≠ 0$ over $S$. For if $u(x)$ is a multiple of $v(x)$,
it is easy to see that $u↓{n+k}$ must be a multiple of $v↓n$
each time we get to step D2, hence the quotient $u(x)/v(x)$ will
be found.\xskip (Applying this observation repeatedly, we obtain an
algorithm that decides whether a given polynomial over $S$ is a multiple of another.)
A set of elements of a unique factorization domain is said to
be {\sl relatively prime} if no prime of that unique factorization
domain divides all of them. A polynomial over a unique factorization
domain is called {\sl primitive} if its coefficients are relatively
prime. (This concept should not be confused with the quite different
idea of ``primitive polynomials modulo $p$'' discussed in Section
3.2.2.) The following fact is of prime importance:
\algbegin Lemma G (\rm Gauss's Lemma). {\sl The product of
primitive polynomials over a unique factorization domain is
primitive.}
|qleft \12skip Proof. \xskip Let $u(x) = u↓mx↑m + \cdots + u↓0$ and $v(x) = v↓nx↑n + \cdots + v↓0$ be primitive
polynomials. If $p$ is any prime of the domain, we must show
that $p$ does not divide all the coefficients of $u(x)v(x)$.
By assumption, there is an index $j$ such that $u↓j$ is not
divisible by $p$, and an index $k$ such that $v↓k$ is not divisible
by $p$. Let $j$ and $k$ be as small as possible; then the coefficient
of $x↑{j+k}$ in $u(x)v(x)$ is $u↓jv↓k + u↓{j+1}v↓{k-1} +
\cdots + u↓{j+k}v↓0 + u↓{j-1}v↓{k+1} +\cdots
+ u↓0v↓{k+j}$, and this is not a multiple of $p$ (since its
first term isn't, but all of its other terms are).
|qleft \6skip \quad If a nonzero polynomial $u(x)$ over $S$
is not primitive, we can write $u(x) = p↓1 \cdot u↓1(x)$, where
$p↓1$ is a prime of $S$ dividing all the coefficients of $u(x)$,
and where $u↓1(x)$ is another nonzero polynomial over $S$. All
of the coefficients of $u↓1(x)$ have one less prime factor than
the corresponding coefficients of $u(x)$. Now if $u↓1(x)$ is
not primitive, we can write $u↓1(x) = p↓2 \cdot u↓2(x)$, etc.,
and this process must ultimately terminate in a representation
$u(x) = c \cdot u↓k(x)$, where $c$ is an element of $S$ and
$u↓k(x)$ is primitive. In fact, we have the following lemma:
\thbegin Lemma H. {\sl Any nonzero polynomial $u(x)$ over
a unique factorization domain $S$ can be factored in the form
u(x) = c \cdot v(x), where c is in S and v(x) is primitive.
Furthermore, this representation is unique, in the sense that
if u(x) = c↓1 \cdot v↓1(x) = c↓2 \cdot v↓2(x), then c↓1 = ac↓2
and v↓2(x) = av↓1(x) where a is a unit of S.}
|qleft \12skip Proof. \xskip We have shown that such a representation
exists, and so only the uniqueness needs to be proved. Assume
that $c↓1 \cdot v↓1(x) = c↓2 \cdot v↓2(x)$, where $v↓1(x)$ and
$v↓2(x)$ are primitive and $c↓1$ is not a unit multiple of $c↓2$.
By unique factorization there is a prime $p$ of $S$ and an exponent
$k$ such that $p↑k$ divides one of $\{c↓1, c↓2\}$ but not the
other, say $p↑k$ divides $c↓1$ but not $c↓2$. Then $p↑k$ divides
all of the coefficients of $c↓2 \cdot v↓2(x)$, so $p$ divides
all the coefficients of $v↓2(x)$, contradicting the assumption
that $v↓2(x)$ is primitive. Hence $c↓1 = ac↓2$, where $a$ is
a unit; and $0 = ac↓2 \cdot v↓1(x) - c↓2 \cdot v↓2(x) = c↓2
\cdot \biglp av↓1(x) - v↓2(x)\bigrp$ implies that $av↓1(x)
- v↓2(x) = 0$.
|qleft \12skip \quad Therefore we may write any nonzero polynomial
$u(x)$ as
$$u(x) = \hjust{cont}(u) \cdot \hjust{pp}\biglp u(x)\bigrp ,\eqno (3)$$
where cont$(u)$, the ``content'' of $u$,
is an element of $S$, and pp$\biglp u(x)\bigrp$,
the ``primitive part'' of $u(x)$, is a primitive polynomial
over $S$. When $u(x) = 0$, it is convenient to define cont$(u)
=$ pp$\biglp u(x)\bigrp = 0$. Combining Lemmas G
and H gives us the relations
$$\vjust{\halign{$\hfill#$⊗${}#\hfill$\cr
\hjust{cont}(u \cdot v) ⊗= a\,\hjust{cont}(u)\,\hjust{cont}(v),\cr
\hjust{pp}\biglp u(x) \cdot v(x)\bigrp ⊗= b\,\hjust{pp}\biglp u(x)\bigrp \,\hjust{pp}\biglp v(x)\bigrp ,\cr}}\eqno (4)$$
where $a$ and $b$ are units, depending on $u$ and $v$,
with $ab = 1$. When we are working with polynomials over the
integers, the only units are +1 and -1, and it is conventional
to define pp$\biglp u(x)|newtype ) |newtype$ so that its leading
coefficient is positive; then (4) is true with $a = b = 1$.
When working with polynomials over a field we may take cont$(u)
= \ell(u)$, the leading coefficient, so that pp$\biglp u(x)\bigrp$ is monic;
in this case again (4) holds with $a = b = 1$, for all $u(x)$
and $v(x)$.
For example, if we are dealing with polynomials
over the integers, let $u(x) = -26x↑2 + 39, v(x) = 21x + 14$.
Then
$$\vjust{\halign{$\hfill#$⊗${}#\hfill$\qquad⊗$\hfill#$⊗${}#\hfill$\cr
\hjust{cont}(u) ⊗= -13,⊗\hjust{pp}\biglp u(x)\bigrp ⊗= 2x↑2 - 3,\cr
\hjust{cont}(v) ⊗= 7,⊗\hjust{pp}\biglp v(x)\bigrp ⊗= 3x + 2,\cr
\hjust{cont}(u \cdot v) ⊗= -91,⊗\hjust{pp}\biglp u(x) \cdot v(x)\bigrp ⊗= 6x↑3 + 4x↑2 - 9x - 6.\cr}}$$
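Contents and primitive parts are easy to compute for polynomials over the integers. The sketch below (Python, illustrative only, using the sign convention above that the primitive part has positive leading coefficient) reproduces these numbers and checks relation (4) with $a = b = 1$.

```python
from math import gcd

def content(u):
    """Content of a nonzero integer polynomial, given as a coefficient
    list with constant term first; signed so that pp(u) has a positive
    leading coefficient."""
    c = 0
    for coeff in u:
        c = gcd(c, coeff)                # gcd of all coefficients
    return c if u[-1] > 0 else -c

def primitive_part(u):
    c = content(u)
    return [coeff // c for coeff in u]   # exact division by the content

def poly_mul(u, v):
    """Ordinary product of two coefficient lists."""
    w = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            w[i + j] += a * b
    return w
```

For $u(x) = -26x↑2 + 39$ and $v(x) = 21x + 14$ this gives cont$(u) = -13$, cont$(v) = 7$, and cont$(u \cdot v) = -91 =$ cont$(u)\,$cont$(v)$, in agreement with the display above.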
%folio 522 galley 3 (C) Addison-Wesley 1978 -
\def\\#1({\mathop{\hjust{#1}}(}
\subsectionbegin{Greatest common divisors} When there
is unique factorization, it makes sense to speak of a ``greatest
common divisor'' of two elements; this is a common divisor that
is divisible by as many primes as possible. (Cf.\ Eq.\ 4.5.2--6.)
Since a unique factorization domain may have many units, there
is a certain amount of ambiguity in this definition of greatest
common divisor; if $w$ is a greatest common divisor of $u$ and
$v$, so is $a \cdot w$, when $a$ is a unit. Conversely, the
assumption of unique factorization implies that if $w↓1$ and
$w↓2$ are both greatest common divisors of $u$ and $v$, then
$w↓1 = a \cdot w↓2$ for some unit $a$. Therefore it does not
make sense, in general, to speak of ``the'' greatest common
divisor of $u$ and $v$; there is a set of greatest common divisors,
each one being a unit multiple of the others.
Let us now consider the problem of finding a greatest
common divisor of two given polynomials over an algebraic system
$S$. If $S$ is a field, the problem is relatively simple; our
division algorithm, Algorithm D\null, can be extended to an algorithm
that computes greatest common divisors, just as Euclid's algorithm
(Algorithm 4.5.2A) yields the greatest common divisor of two
given integers based on a division algorithm for integers: If
$v(x) = 0$, then gcd$\biglp u(x), v(x)\bigrp = u(x)$; otherwise
gcd$\biglp u(x), v(x)\bigrp =$ gcd$\biglp v(x), r(x)\bigrp$,
where $r(x)$ is given by (1). This procedure is called Euclid's
algorithm for polynomials over a field; it was first used by Simon Stevin
in 1585 [{\sl Les {\oe}uvres math|spose 1ematiques de Simon Stevin},
ed.\ by A. Girard, {\bf 1} (Leyden, 1634), 56].
For example, let us determine the gcd of $x↑8 +
x↑6 + 10x↑4 + 10x↑3 + 8x↑2 + 2x + 8$ and $3x↑6 + 5x↑4 + 9x↑2
+ 4x + 8$, mod 13, by using Euclid's algorithm for polynomials
over the integers modulo 13. First, writing only the coefficients
to show the steps of Algorithm D\null, we have
$$\vjust{\halign{$#\hfill$⊗$\ #\hfill$\cr
⊗\qquad\qquad\qquad\quad 9\ \ 0\ \ 7\cr
3\ 0\ 5\ 0\ 9\ 4\ 8\ )⊗1\ \ 0\ \ 1\ \ 0\ \ 10\ \ 10\ \ 8\ \ 2\ \ 8\cr
⊗1\ \ 0\ \ 6\ \ 0\ \ 3\ \ 10\ \ 7\cr
⊗\quad 0\ \ 8\ \ 0\ \ 7\ \ 0\ \ 1\ \ 2\ \ 8\cr
⊗\qquad 8\ \ 0\ \ 9\ \ 0\ \ 11\ \ 2\ \ 4\cr
⊗\qquad\quad 0\ \ 11\ \ 0\ \ 3\ \ 0\ \ 4\cr}}\eqno (5)$$
and hence
$$\vjust{\halign{$\hfill#$⊗${}#\hfill$\cr
x↑8 + x↑6 + 10x↑4 + 10x↑3 + 8x↑2 + 2x + 8 ⊗= (9x↑2 + 7)(3x↑6 + 5x↑4 + 9x↑2 + 4x + 8) + 11x↑4 + 3x↑2 + 4;\cr
3x↑6 + 5x↑4 + 9x↑2 + 4x + 8 ⊗= (5x↑2 + 5)(11x↑4 + 3x↑2 + 4) + 4x + 1;\cr
11x↑4 + 3x↑2 + 4 ⊗= (6x↑3 + 5x↑2 + 6x + 5)(4x + 1) + 12;\cr
4x + 1 ⊗= (9x + 12) \cdot 12 + 0.\cr}}\eqno (6)$$
(The equality sign here means congruence modulo 13,
since all arithmetic on the coefficients has been done mod 13.)
This computation shows that 12 is a greatest common divisor
of the two original polynomials. Now any nonzero element of
a field is a unit of the domain of polynomials over that field,
so any nonzero multiple of a greatest common divisor is also
a greatest common divisor (over a field). It is therefore conventional
in this case to divide the result of the algorithm by its leading
coefficient, producing a {\sl monic} polynomial that is called
{\sl the} greatest common divisor of the two given polynomials.
The gcd computed in (6) is accordingly taken to be 1, not 12.
The last step in (6) could have been omitted, for if deg$(v)
= 0$, then gcd$\biglp u(x), v(x)\bigrp = 1$, no matter what
polynomial is chosen for $u(x)$. Exercise 4 determines the average
running time for Euclid's algorithm on random polynomials modulo
$p$.
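Euclid's algorithm for polynomials over the integers modulo a prime is only a few lines of code. This sketch (Python, illustrative, coefficient lists with constant term first) normalizes the final answer to a monic polynomial, as described above.

```python
def poly_mod(u, v, p):
    """Remainder of u(x) divided by v(x), coefficients mod a prime p;
    polynomials are coefficient lists with constant term first."""
    u = [c % p for c in u]
    n = len(v) - 1
    inv = pow(v[n], p - 2, p)            # 1/v_n mod p
    for k in range(len(u) - 1 - n, -1, -1):
        q = u[n + k] * inv % p
        for j in range(n + k, k - 1, -1):
            u[j] = (u[j] - q * v[j - k]) % p
    while u and u[-1] == 0:
        u.pop()                          # strip leading zero coefficients
    return u

def poly_gcd_mod_p(u, v, p):
    """Euclid's algorithm; the result is made monic, so it is 'the' gcd."""
    while v:
        u, v = v, poly_mod(u, v, p)
    inv = pow(u[-1], p - 2, p)
    return [c * inv % p for c in u]
```

Applied with $p = 13$ to the two polynomials of (5) and (6), it reproduces the remainder sequence and returns the monic gcd 1.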
|qleft \12skip \quad Let us now turn to the more general situation
in which our polynomials are given over a unique factorization
domain that is not a field. From Eqs.\ (4) we can deduce the
important relations
$$\vjust{\halign{$\hfill#$⊗${}#\hfill$\cr
\hjust{cont}\biglp \hjust{gcd}(u, v)\bigrp ⊗= a \cdot \hjust{gcd}\biglp \hjust{cont}(u), \hjust{cont}(v)\bigrp ,\cr
\hjust{pp}\biglp \hjust{gcd}(u(x), v(x))\bigrp ⊗= b \cdot \hjust{gcd}\biglp \hjust{pp}(u(x)), \hjust{pp}(v(x))\bigrp ,\cr}}\eqno (7)$$
where $a$ and $b$ are units. Here gcd$\biglp u(x), v(x)\bigrp$
denotes any particular polynomial in $x$ that is a greatest
common divisor of $u(x)$ and $v(x)$. Equations (7) reduce the
problem of finding greatest common divisors of arbitrary polynomials
to the problem of finding greatest common divisors of {\sl primitive}
polynomials.
Algorithm D for division of polynomials over a
field can be generalized to a pseudo-division of polynomials
over any algebraic system that is a commutative ring with identity.
We can observe that Algorithm D requires explicit division only
by $\ell(v)$, the leading coefficient of $v(x)$, and that step
D2 is carried out exactly $m - n + 1$ times; thus if $u(x)$
and $v(x)$ start with integer coefficients, and if we are working
over the rational numbers, then the only denominators that
appear in the coefficients of $q(x)$ and $r(x)$ are divisors
of $\ell(v)↑{m-n+1}$. This suggests that we can always find polynomials
$q(x)$ and $r(x)$ such that
$$\ell(v)↑{m-n+1}u(x) = q(x)v(x) + r(x),\qquad \hjust{deg}(r) < n,\eqno (8)$$
where $m =$ deg$(u)$, $n =$ deg$(v)$, and $m ≥ n$,
for any polynomials $u(x)$ and $v(x) ≠ 0$.
\algbegin Algorithm R (Pseudo-division of polynomials).
Given polynomials
$$u(x) = u↓mx↑m + \cdots + u↓1x + u↓0,\qquad v(x) = v↓nx↑n + \cdots + v↓1x + v↓0,$$
where $v↓n ≠ 0$ and $m ≥ n ≥ 0$, this algorithm
finds polynomials $q(x) = q↓{m-n}x↑{m-n} + \cdots
+ q↓0$ and $r(x) = r↓{n-1}x↑{n-1} + \cdots + r↓0$
satisfying (8).
\algstep R1. [Iterate on $k$.] Do step
R2 for $k = m - n, m - n - 1, \ldotss$, 0; then the algorithm
terminates with $u↓{n-1} = r↓{n-1}, \ldotss$, $u↓0 = r↓0$.
\algstep R2. [Multiplication loop.] Set $q↓k ← u↓{n+k}v↑{k}↓{n}$,
and then set $u↓j ← v↓nu↓j - u↓{n+k}v↓{j-k}$ for $j = n + k
- 1, n + k - 2, \ldotss$, 0.\xskip (When $j < k$ this means that $u↓j
← v↓nu↓j$, since we treat $v↓{-1}, v↓{-2}, \ldotss $, as zero;
these multiplications could have been avoided if we had started
by replacing $u↓t$ by $v↑{m-n-t}↓{n}u↓t$ for $0 ≤ t < m - n$.)
|qleft
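Algorithm R can be sketched in present-day Python as follows (illustrative, not from the text; polynomials are coefficient lists with constant term first).

```python
def pseudo_divmod(u, v):
    """Algorithm R: pseudo-division of u(x) by v(x) over the integers.

    Returns (q, r) satisfying (8): lc(v)^(m-n+1) u(x) = q(x)v(x) + r(x),
    deg(r) < deg(v), where lc(v) is the leading coefficient v_n.
    No division of coefficients is performed anywhere."""
    u = list(u)
    m, n = len(u) - 1, len(v) - 1
    vn = v[n]
    q = [0] * (m - n + 1)
    for k in range(m - n, -1, -1):          # step R1: k = m-n, ..., 0
        q[k] = u[n + k] * vn ** k           # step R2: q_k <- u_{n+k} v_n^k
        for j in range(n + k - 1, -1, -1):  # u_j <- v_n u_j - u_{n+k} v_{j-k}
            u[j] = vn * u[j] - u[n + k] * (v[j - k] if j >= k else 0)
    return q, u[:n]
```

For the polynomials $u(x) = x↑8 + x↑6 - 3x↑4 - 3x↑3 + 8x↑2 + 2x - 5$ and $v(x) = 3x↑6 + 5x↑4 - 4x↑2 - 9x + 21$ of this section's running example, the result is $q(x) = 9x↑2 - 6$ and $r(x) = -15x↑4 + 3x↑2 - 9$, with the factor $\ell(v)↑3 = 27$ on the left of (8).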
%folio 524 galley 4 (C) Addison-Wesley 1978 -
\def\\#1({\mathop{\hjust{#1}}(}
|qleft \12skip \quad An example calculation appears below in
(10); it is easy to prove the validity of Algorithm R by induction
on $m - n$, since each execution of step R2 essentially replaces
$u(x)$ by $\ell(v)u(x) - \ell(u)x↑kv(x)$, where $k =$ deg$(u) -$ deg$(v)$.
Note that no division whatever is used in this
algorithm; the coefficients of $q(x)$ and $r(x)$ are themselves
certain polynomial functions of the coefficients of $u(x)$ and
$v(x)$. If $v↓n = 1$, the algorithm is identical to Algorithm
D\null. If $u(x)$ and $v(x)$ are polynomials over a unique factorization
domain, we can prove as before that the polynomials $q(x)$ and
$r(x)$ are unique; therefore another way to do the pseudo-division
over a unique factorization domain is to multiply $u(x)$ by
$v↑{m-n+1}↓{n}$ and apply Algorithm D\null, knowing that all the
quotients in step D2 will exist.
Algorithm R can be extended to a ``generalized
Euclidean algorithm'' for primitive polynomials over a unique
factorization domain, in the following way: Let u(x)$ and $v(x)$
be primitive polynomials with deg$(u) ≥$ deg$(v)$, and determine
$r(x)$ satisfying (8) by means of Algorith R. Now gcd\biglp
$u(x), v(x)\bigrp =$ gcd\biglp $v(x), r(x)\bigrp :$ For any
common divisor of $u(x)$ and $v(x)$ divides $v(x)$ and $r(x)$;
conversely, any common divisor of $v(x)$ and $r(x)$ divides
$??3(v)↑{m-n+1}u(x)$, and it must be primitive \biglp since
$v(x)$ is primitive\bigrp so it divides $u(x)$. If $r(x) = 0$,
we therefore have gcd\biglp $u(x), v(x)\bigrp = v(x)$; if $r(x)
≠ 0$, we have gcd\biglp $v(x), r(x)\bigrp =$ gcd\biglp $v(x)$,
pp$(r(x))\bigrp$ since $v(x)$ is primitive, so the process can
be iterated.
\algbegin Algorithm E (Generalized Euclidean algorithm).
Given nonzero polynomials $u(x)$
and $v(x)$ over a unique factorization domain $S$, this algorithm
calculates a greatest common divisor of $u(x)$ and $v(x)$. We
assume that auxiliary algorithms exist to calculate greatest
common divisors of elements of $S$, and to divide $a$ by $b$
in $S$ when $b ≠ 0$ and $a$ is a multiple of $b$.
\algstep E1. [Reduce to primitive.] Set
$d ←$ gcd\biglp cont$(u)$, cont$(v)\bigrp $, using the assumed
algorithm for calculating greatest common divisors in $S$.\xskip (Note
that cont$(u)$ is a greatest common divisor of the coefficients
of $u(x)$.)\xskip Replace $u(x)$ by the polynomial $u(x)/\hjust{cont}(u)
=$ pp$\biglp u(x)\bigrp$; similarly, replace $v(x)$ by pp$\biglp v(x)\bigrp$.
\algstep E2. [Pseudo-division.] Calculate $r(x)$ using Algorithm
R. \biglp It is unnecessary to calculate the quotient polynomial
$q(x).\bigrp$ If $r(x) = 0$, go to E4. If deg$(r) = 0$, replace
$v(x)$ by the constant polynomial ``1'' and go to E4.
\algstep E3. [Make remainder primitive.] Replace $u(x)$ by $v(x)$,
and replace $v(x)$ by pp$\biglp r(x)\bigrp$. Go back to step
E2. (This is the ``Euclidean step,'' analogous to the other
instances of Euclid's algorithm that we have seen.)
\algstep E4. [Attach the content.] The algorithm terminates,
with $d \cdot v(x)$ as the answer.
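The steps above can be sketched as follows (Python, illustrative; integer coefficients with constant term first; content, pp, and pseudo_rem play the roles of the auxiliary routines the algorithm assumes).

```python
from math import gcd

def content(u):
    """gcd of the coefficients, signed so pp(u) has positive leading coeff."""
    c = 0
    for coeff in u:
        c = gcd(c, coeff)
    return c if u[-1] > 0 else -c

def pp(u):
    c = content(u)
    return [x // c for x in u]

def pseudo_rem(u, v):
    """The remainder r(x) of Algorithm R (pseudo-division)."""
    u = list(u)
    n, vn = len(v) - 1, v[-1]
    for k in range(len(u) - 1 - n, -1, -1):
        top = u[n + k]
        for j in range(n + k, -1, -1):
            u[j] = vn * u[j] - top * (v[j - k] if j >= k else 0)
    while u and u[-1] == 0:
        u.pop()
    return u

def poly_gcd_int(u, v):
    """Algorithm E: a gcd of two nonzero integer polynomials."""
    if len(u) < len(v):
        u, v = v, u
    d = gcd(content(u), content(v))      # step E1: reduce to primitive
    u, v = pp(u), pp(v)
    while True:
        r = pseudo_rem(u, v)             # step E2: pseudo-division
        if not r:
            break
        if len(r) == 1:                  # deg(r) = 0: replace v(x) by 1
            v = [1]
            break
        u, v = v, pp(r)                  # step E3: make remainder primitive
    return [d * c for c in v]            # step E4: attach the content
```

For the example treated next in the text it returns the constant polynomial 1, confirming that the two polynomials are relatively prime.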
|qleft \12skip |cancelindent \quad As an example of
Algorithm E\null, let us calculate the greatest common divisor of
$$u(x) = x↑8 + x↑6 - 3x↑4 - 3x↑3 + 8x↑2 + 2x - 5,\qquad v(x) = 3x↑6 + 5x↑4 - 4x↑2 - 9x + 21.\eqno (9)$$
%folio 526 galley 5 (C) Addison-Wesley 1978 -
\def\\#1({\mathop{\hjust{#1}}(}
To improve that algorithm, we can reduce
$u(x)$ and $v(x)$ to monic polynomials at each step, since this
removes ``unit'' factors that make the coefficients more complicated
than necessary; this is actually Algorithm E over the rationals:
$$\vjust{\baselineskip15pt\halign{$\hfill#$⊗\qquad$\hfill#$\cr
u(x)\hfill⊗v(x)\hfill\cr
\noalign{\vskip2pt}
1,0,1,0,-3,-3,8,2,-5⊗3,0,5,0,-4,-9,21\cr
1,0,{5\over3},0,-{4\over3},-3,7⊗1,0,-{1\over5},0,{3\over5}\cr
1,0,-{1\over5},0,{3\over5}⊗1,{25\over13},-{49\over13}\cr
1,{25\over13},-{49\over13}⊗1,-{6150\over4663}\cr
1,-{6150\over4663}⊗1\cr}}\eqno(14)$$
In both (13) and (14) the sequence of polynomials
is essentially the same as (12), which was obtained by Algorithm
E over the integers; the only difference is that the polynomials
have been multiplied by certain rational numbers. Whether we
have $5x↑4 - x↑2 + 3$ or $-{5\over9}x↑4 + {1\over9}x↑2 -
{1\over3}$ or $x↑4 - {1\over5}x↑2 + {3\over5}$, the computations
arithmetic will run noticeably slower than the all-integer Algorithm
E\null, since rational arithmetic requires many more evaluations
of gcd's of integers within each step. Therefore it is definitely
preferable to use the all-integer algorithm instead of rational
arithmetic, when the gcd of polynomials with integer or rational
coefficients is desired.
It is also instructive to compare the above calculations
with (6) above, where we determined the gcd of the same polynomials
$u(x)$ and $v(x)$ modulo 13 with considerably less labor. Since
$\ell(u)$ and $\ell(v)$ are not multiples of 13, the fact that
gcd$\biglp u(x), v(x)\bigrp = 1$ modulo 13 is sufficient to
prove that $u(x)$ and $v(x)$ are relatively prime over the integers
(and therefore over the rational numbers); we will return to
this time-saving observation at the close of Section 4.6.2.
\subsectionbegin{Collins's Algorithm} An ingenious
algorithm that is generally superior to Algorithm E\null, and that
gives us further information about Algorithm E's behavior, has
been discovered by George E. Collins [{\sl JACM \bf 14} (1967),
128--142]. His algorithm avoids the calculation of primitive
part in step E3, dividing instead by an element of $S$ that
is known to be a factor of $r(x):$
\algbegin Algorithm C (Greatest common divisor over a unique
factorization domain). This algorithm has the same input
and output assumptions as Algorithm E\null, and has the advantage
that fewer calculations of greatest common divisors of coefficients
are needed, although it may have to deal with larger numbers
as a result.
\algstep C1. [Reduce to primitive.] As in
step E1 of Algorithm E\null, set $d ←$ gcd\biglp cont$(u)$, cont$(v)\bigrp
$, and replace \biglp $u(x), v(x)\bigrp$ by \biglp pp$(u(x))$,
pp$(v(x))\bigrp $. Also set $a ← 1$.
\algstep C2. [Pseudo-division.] Set $b ← \ell(v)↑{\hjust{deg}(u)-\hjust{deg}(v)+1}$.
Calculate $r(x)$ using Algorithm R\null. If $r(x) = 0$, go to C4.
If deg$(r) = 0$, replace $v(x)$ by the constant polynomial ``1''
and go to C4.
\algstep C3. [Adjust remainder.] Replace the polynomial $u(x)$
by $v(x)$, and replace $v(x)$ by $r(x)/a$.\xskip (At this point all
coefficients of $r(x)$ are multiples of $a$.)\xskip Also set $a ←
b$; then return to C2.
\algstep C4. [Attach the content.] The algorithm terminates,
with $d \cdot$ pp\biglp $v(x)\bigrp$ as the answer.
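A sketch of Algorithm C in Python (illustrative; same conventions and helper routines as in the sketch of Algorithm E above; the optional trace list is an added convenience for observing the sequence that appears in (15) below).

```python
from math import gcd

def content(u):
    c = 0
    for coeff in u:
        c = gcd(c, coeff)
    return c if u[-1] > 0 else -c        # sign: pp gets positive leading coeff

def pp(u):
    c = content(u)
    return [x // c for x in u]

def pseudo_rem(u, v):
    """Remainder of Algorithm R; coefficient lists with constant term first."""
    u = list(u)
    n, vn = len(v) - 1, v[-1]
    for k in range(len(u) - 1 - n, -1, -1):
        top = u[n + k]
        for j in range(n + k, -1, -1):
            u[j] = vn * u[j] - top * (v[j - k] if j >= k else 0)
    while u and u[-1] == 0:
        u.pop()
    return u

def collins_gcd(u, v, trace=None):
    """Algorithm C: instead of taking primitive parts at every step, each
    pseudo-remainder is divided by the scale factor a from the step before."""
    if len(u) < len(v):
        u, v = v, u
    d = gcd(content(u), content(v))      # step C1: reduce to primitive
    u, v = pp(u), pp(v)
    a = 1
    while True:
        if trace is not None:
            trace.append((u, v, a))      # record one row of the sequence
        b = v[-1] ** (len(u) - len(v) + 1)   # step C2: b <- lc(v)^(deg u - deg v + 1)
        r = pseudo_rem(u, v)
        if not r:
            break
        if len(r) == 1:
            v = [1]
            break
        u, v = v, [c // a for c in r]    # step C3: every coeff of r is a multiple of a
        a = b
    return [d * c for c in pp(v)]        # step C4: attach the content
```

Run on the example polynomials of this section, the successive $\biglp u(x), v(x), a\bigrp$ triples reproduce the rows of (15), and the result is 1.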
|qleft |cancelindent \12skip \quad If we apply this algorithm
to the polynomials (9) considered earlier, the following sequence
of results is obtained:
$$\vjust{\baselineskip15pt\halign{$#\hfill$\qquad⊗$#\hfill$\qquad⊗$#\hfill$\cr
u(x)\hfill⊗v(x)\hfill⊗a\hfill\cr
\noalign{\vskip2pt}
1, 0, 1, 0, -3, -3, 8, 2, -5⊗3, 0, 5, 0, -4, -9, 21⊗1\cr
3, 0, 5, 0, -4, -9, 21⊗-15, 0, 3, 0, -9⊗27\cr
-15, 0, 3, 0, -9⊗585, 1125, -2205⊗(-15)↑3\cr
585, 1125, -2205⊗-18885150, 24907500⊗(585)↑3\cr}}\eqno(15)$$
At the conclusion of the algorithm, $r(x)/a = 527933700$.
The sequence of polynomials consists of integral
multiples of the polynomials in the sequence produced by Algorithm
E\null. The numbers in (15) actually constitute an unusually bad
example of the efficiency of Algorithm C\null, for reasons explained
later; but our example shows that, in spite of the fact that
the polynomials are not reduced to primitive form, the coefficients
are kept to a reasonable size because the reduction factor $a$
is large.
In order to analyze Algorithm C and to prove that
it is valid, let us call the sequence of polynomials it produces
$u↓1(x), u↓2(x), u↓3(x), \ldotss $, where $u↓1(x) = u(x)$ and
$u↓2(x) = v(x)$. Let $t↓1 = 0$ and let $t↓{j+1} = n↓j - n↓{j+1}
+ 1$, where $n↓j =$ deg$(u↓j)$, $j ≥ 1$. Then if $\ell↓j = \ell(u↓j)$,
we have
$$\vjust{\halign{$\hfill#$⊗${}#\hfill$\cr
\ell↑{t↓2}↓{2}u↓1(x) ⊗= u↓2(x)q↓1(x) + \ell↑{t↓1}↓{1}u↓3(x),\qquad n↓3 < n↓2;\cr
\ell↑{t↓3}↓{3}u↓2(x) ⊗= u↓3(x)q↓2(x) + \ell↑{t↓2}↓{2}u↓4(x),\qquad n↓4 < n↓3;\cr
\ell↑{t↓4}↓{4}u↓3(x) ⊗= u↓4(x)q↓3(x) + \ell↑{t↓3}↓{3}u↓5(x),\qquad n↓5 < n↓4;\cr}}\eqno (16)$$
and so on. The process terminates when $n↓{k+1}
=$ deg$(u↓{k+1}) ≤ 0$. We must show that $u↓4(x), u↓5(x), \ldotss$,
have coefficients in $S$, i.e., that the factor $\ell↑{t↓{k+1}}↓{k+1}$
introduced into the dividend in the $k$th step can be divided
from the remainder in the $(k + 1)$st step. The proof is rather
involved, and it can be most easily understood by considering
an example.
|qleft \12skip {\:r Table 1}
|qctr \3skip |newtype COEFFICIENTS IN COLLINS'S ALGORITHM
|qctr \6skip \vjust{\halign{#\hfill\qquad⊗#\hfill\qquad⊗#\hfill\qquad⊗#\hfill\cr
Row name⊗Row⊗Multiply by⊗Replace by row\cr}}
%folio 528 galley 6 Beginning lost. (C) Addison-Wesley 1978 -
|linerule
$$M = \vjust{\halign{$#$\qquad⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$\cr
A↓2⊗a↓8⊗a↓7⊗a↓6⊗a↓5⊗a↓4⊗a↓3⊗a↓2⊗a↓1⊗a↓0⊗0⊗0\cr
A↓1⊗0⊗a↓8⊗a↓7⊗a↓6⊗a↓5⊗a↓4⊗a↓3⊗a↓2⊗a↓1⊗a↓0⊗0\cr
A↓0⊗0⊗0⊗a↓8⊗a↓7⊗a↓6⊗a↓5⊗a↓4⊗a↓3⊗a↓2⊗a↓1⊗a↓0\cr
B↓4⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0⊗0⊗0⊗0⊗0\cr
B↓3⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0⊗0⊗0⊗0\cr
B↓2⊗0⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0⊗0⊗0\cr
B↓1⊗0⊗0⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0⊗0\cr
B↓0⊗0⊗0⊗0⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0\cr}}.\eqno (17)$$
\6skip$ The indicated row operations and a permutation of rows
will transform $M$ into
$$M↑\prime = \vjust{\halign{$#$\qquad⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$⊗$\ #$\cr
B↓4⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0⊗0⊗0⊗0⊗0\cr
B↓3⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0⊗0⊗0⊗0\cr
B↓2⊗0⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0⊗0⊗0\cr
B↓1⊗0⊗0⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0⊗0\cr
C↓2⊗0⊗0⊗0⊗0⊗c↓4⊗c↓3⊗c↓2⊗c↓1⊗c↓0⊗0⊗0\cr
C↓1⊗0⊗0⊗0⊗0⊗0⊗c↓4⊗c↓3⊗c↓2⊗c↓1⊗c↓0⊗0\cr
C↓0⊗0⊗0⊗0⊗0⊗0⊗0⊗c↓4⊗c↓3⊗c↓2⊗c↓1⊗c↓0\cr
D↓0⊗0⊗0⊗0⊗0⊗0⊗0⊗0⊗0⊗d↓2⊗d↓1⊗d↓0\cr}}.\eqno (18)$$
\6skip$ Because of the way $M↑\prime$ has been derived from
$M$, we have
$$b↓6↑3 \cdot b↓6↑3 \cdot b↓6↑3 \cdot (c↓4↑3/b↓6↑3) \cdot \hjust{det}\,M↓0 = \pm \hjust{det}\,M↑\prime↓{0},\eqno (19)$$
if $M↓0$ and $M↑\prime↓{0}$ represent any square
matrices obtained by selecting eight corresponding columns from
$M$ and $M↑\prime$. For example, let us select the first seven
columns and the column containing $d↓1$; then
$$
|qctr ⊗a$↓8⊗a↓7⊗a↓6⊗a↓5⊗a↓4⊗a↓3⊗a↓2⊗0⊗\cr
\4skip ⊗0⊗a↓8⊗a↓7⊗a↓6⊗a↓5⊗a↓4⊗a↓3⊗a↓0⊗\cr
\4skip ⊗0⊗0⊗a↓8⊗a↓7⊗a↓6⊗a↓5⊗a↓4⊗a↓1⊗\cr
\4skip b↓6↑3 \cdot b↓6↑3 \cdot b↓6↑3 \cdot
(c↓4↑3/b↓6↑3) \cdot$ det⊗$b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0⊗0⊗
= \pm b↓6↑4 \cdot c↓4↑3 \cdot d↓1.\cr
\4skip ⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗0⊗\cr
\4skip ⊗0⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗0⊗\cr
\4skip ⊗0⊗0⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓0⊗\cr
\4skip ⊗0⊗0⊗0⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓1⊗(20)\cr
\6skip$ Since $b↓6c↓4 ≠ 0$, this proves that $d↓1$ is an integer;
in fact, $d↓1$ is a multiple of $b↓6↑2$. Similarly, $d↓2$
and $d↓0$ are integer multiples of $b↓6↑2$. This accounts
for the fact that the numbers 585, 1125, -2205 in (15) are multiples
of 9.
In general, we can show that $u↓{j+1}(x)$ has integer
coefficients in a similar manner. If we start with the matrix
$M$, consisting of rows $A↓{n↓2-n↓j}$ through
$A↓0$ and $B↓{n↓1-n↓j}$ through $B↓0$, and if
we perform the row operations indicated in Table 1, we will
obtain a matrix $M↑\prime$ consisting in some order of rows
$B↓{n↓1-n↓j}$ through $B↓{n↓3-n↓j+1}$,
$C↓{n↓2-n↓j}$ through $C↓{n↓4-n↓j+1}$, $\ldotss$, $F↓{n↓{j-2}-n↓j}$
through $F↓1$, $G↓{n↓{j-1}-n↓j}$ through $G↓0$, and finally $H↓0$
[a row containing the coefficients of $u↓{j+1}(x)$]. Extracting
appropriate columns shows that
$$$\quad (ℓ↓2↑{t↓2})↑{n↓2-n↓j+1}(ℓ↓3↑{t↓3}/ℓ↓2↑{t↓2})↑{n↓3-n↓j+1}
\cdots (ℓ↓j↑{t↓j}/ℓ↓{j-1}↑{t↓{j-1}})↑{n↓j-n↓j+1}$ det
$M↓0
\4skip = \pm ℓ↓2↑{n↓1-n↓3}ℓ↓3↑{n↓2-n↓4} \cdots
ℓ↓{j-1}↑{n↓{j-2}-n↓j}ℓ↓j↑{n↓{j-1}-n↓j+1}h↓i,\quad (21)$
|qright \6skip where $h↓i$ is a given coefficient of $u↓{j+1}(x)$
and $M↓0$ is a submatrix of $M$. Thus {\sl every coefficient
of u↓{j+1}(x) is a multiple of}
$$ℓ↓2↑{(t↓2-1)(t↓3-2)}ℓ↓3↑{(t↓3-1)(t↓4-2)} \cdots ℓ↓{j-1}↑{(t↓{j-1}-1)(t↓j-2)}.\eqno (22)$
|qctr \6skip (This proof, although stated for the domain of
integers, obviously applies to any unique factorization domain.)
The quantity (22) appears to be an extremely large
number that makes the coefficients of $u↓{j+1}(x)$ much larger
than they need to be. It would be possible to rewrite Algorithm
C in an appropriate manner so that this factor would be eliminated.
But actually {\sl the quantity (22) is almost always equal to
unity!}\xskip This follows from the fact that $t↓3, t↓4, \ldotss
$, $t↓k$ will all be equal to 2, for almost every given pair of
polynomials $u↓1(x), u↓2(x)$; the particular polynomials in
example (15) are very exceptional indeed, since both $t↓3$ and
$t↓4$ are equal to 3. In order to prove this fact, note for
example that we could have chosen the first eight columns of
$M$ and $M↑\prime$ in (17) and (18), and then we would have
found in place of Eq.\ (19) that $u↓4(x)$ has degree less than
3 if and only if $d↓3 = 0$, that is, if and only if
$$
|qctr \56skip (23)|zfa
|qright \-56skip ⊗$a↓8⊗a↓7⊗a↓6⊗a↓5⊗a↓4⊗a↓3⊗a↓2⊗a↓1⊗\cr
\4skip ⊗0⊗a↓8⊗a↓7⊗a↓6⊗a↓5⊗a↓4⊗a↓3⊗a↓2⊗\cr
\4skip ⊗0⊗0⊗a↓8⊗a↓7⊗a↓6⊗a↓5⊗a↓4⊗a↓3⊗\cr
\4skip$ det⊗$b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0⊗0⊗ = 0.\cr
\4skip ⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗b↓0⊗\cr
\4skip ⊗0⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗b↓1⊗\cr
\4skip ⊗0⊗0⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗b↓2⊗\cr
\4skip ⊗0⊗0⊗0⊗0⊗b↓6⊗b↓5⊗b↓4⊗b↓3⊗\cr
%folio 530 galley 7 (C) Addison-Wesley 1978 -
In the process of verifying Algorithm C, we have also learned that every element
of $S$ that the algorithm must deal with can be expressed as a determinant whose
entries are the coefficients of the primitive parts of the original polynomials.
A well-known theorem of Hadamard (see exercise 15) states that
$$|\det(a↓{ij})|≤\prod↓{1≤i≤n}\;\bigglp\sum↓{1≤j≤n}a↓{ij}↑2\biggrp↑{1/2};
\eqno(25)$$
therefore an upper bound for the maximum coefficient
appearing in the polynomials computed by Algorithm C is
$$N↑{m+n}(m + 1)↑{n/2}(n + 1)↑{m/2},\eqno (26)$$
if all coefficients of the given polynomials $u(x)$
and $v(x)$ are bounded by $N$ in absolute value. This same
upper bound applies to the coefficients of all polynomials $u(x)$ and
$v(x)$ computed during the execution of Algorithm E\null,
since the polynomials obtained
in Algorithm E are always divisors of the polynomials obtained
in Algorithm C\null.
This upper bound on the coefficients is extremely
gratifying, because it is much better than we would ordinarily
have a right to expect. For example, suppose we were to perform Algorithm
E or Algorithm C with {\sl no} correction in step E3 or C3,
just replacing $v(x)$ by $r(x)$. This is the simplest
gcd algorithm, and it is one that traditionally appears in
textbooks on algebra (for theoretical purposes, not intended
for practical calculations).\xskip If we suppose that $\delta↓1=\delta↓2=
\cdots = 1$, we find that the coefficients of $u↓3(x)$
are bounded by $N↑3$, the coefficients of $u↓4(x)$ are bounded
by $N↑7$, those of $u↓5(x)$ by $N↑{17}$, $\ldotss $, where the size
of the coefficients is $N↑{a↓k}$, for $a↓k = 2a↓{k-1} +
a↓{k-2}$; thus $a↓k$ grows as $(2.414)↑k$, and the upper bound
that takes the place of (26) would be approximately
$$N↑{0.5(2.414)↑n},$$
and experiments show that the simple algorithm
does in fact have this behavior; the number of digits in the
coefficients grows exponentially at each step! In Algorithm
E the growth in number of digits is only slightly more than
linear at most.
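The contrast is easy to reproduce. The following sketch (the coefficient-list representation and all function names are our own conventions, not part of the text) runs the pseudo-division remainder sequence for the example pair of this section twice: once with no correction after each step, and once reducing each remainder to its primitive part in the spirit of Algorithm E.

```python
from math import gcd
from functools import reduce

# Coefficient lists, highest degree first: x^2 + 1 -> [1, 0, 1].

def pseudo_rem(u, v):
    """Pseudo-remainder r with l(v)^(deg u - deg v + 1) * u(x) = q(x)v(x) + r(x)."""
    lv = v[0]
    for _ in range(len(u) - len(v) + 1):
        lead = u[0]
        u = [lv * c for c in u[1:]]      # multiply the tail by l(v) ...
        for i in range(len(v) - 1):
            u[i] -= lead * v[i + 1]      # ... and cancel the leading term
    while u and u[0] == 0:
        u = u[1:]
    return u

def primitive_part(u):
    c = reduce(gcd, (abs(x) for x in u)) or 1
    return [x // c for x in u]

def remainder_sequence(u, v, primitive):
    seq = [u, v]
    while len(seq[-1]) > 1:              # stop at a constant (or zero) remainder
        r = pseudo_rem(seq[-2], seq[-1])
        if not r:
            break
        seq.append(primitive_part(r) if primitive else r)
    return seq

# The example pair of this section, which is relatively prime:
u1 = [1, 0, 1, 0, -3, -3, 8, 2, -5]      # x^8 + x^6 - 3x^4 - 3x^3 + 8x^2 + 2x - 5
u2 = [3, 0, 5, 0, -4, -9, 21]            # 3x^6 + 5x^4 - 4x^2 - 9x + 21

naive = remainder_sequence(u1, u2, primitive=False)
prim  = remainder_sequence(u1, u2, primitive=True)

assert len(prim[-1]) == 1                              # gcd is a unit: coprime
assert max(abs(c) for q in prim for c in q) < 10**6    # primitive parts stay small
assert max(abs(c) for q in naive for c in q) > 10**6   # uncorrected growth explodes
```

The uncorrected sequence accumulates the extraneous powers of leading coefficients described above, while the primitive-part sequence stays within a few digits.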
The considerations above can be used to derive
the well-known fact that two polynomials are relatively prime
if and only if their ``resultant'' is nonzero; the resultant
is a determinant having the form of rows $A↓5$ through $A↓0$
and $B↓7$ through $B↓0$ in Table 1. (This is ``Sylvester's determinant'';
see exercise 12. Further properties of resultants are discussed
in B. L. van der Waerden, {\sl Modern Algebra}, tr. by Fred
Blum (New York: Ungar, 1949), Sections 27--28.) From the standpoint
discussed above, we would say that the gcd is ``almost always''
of degree zero, since Sylvester's determinant is almost never
zero. But many calculations of practical interest would never
be undertaken if there weren't some reasonable chance that the
gcd would be a polynomial of positive degree.
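Sylvester's determinant can be evaluated directly to test relative primality. Here is a minimal sketch using exact rational arithmetic; the row layout follows Table 1, and the function names are ours.

```python
from fractions import Fraction

def sylvester(u, v):
    """Sylvester's matrix of u (degree m) and v (degree n): n shifted copies
    of u's coefficient row followed by m shifted copies of v's, as in Table 1."""
    m, n = len(u) - 1, len(v) - 1
    size = m + n
    rows = []
    for i in range(n):                                   # rows A_{n-1} ... A_0
        rows.append([0] * i + list(u) + [0] * (size - m - 1 - i))
    for i in range(m):                                   # rows B_{m-1} ... B_0
        rows.append([0] * i + list(v) + [0] * (size - n - 1 - i))
    return rows

def det(a):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in a]
    n, d = len(a), Fraction(1)
    for j in range(n):
        p = next((i for i in range(j, n) if a[i][j]), None)
        if p is None:
            return Fraction(0)
        if p != j:
            a[j], a[p] = a[p], a[j]
            d = -d
        d *= a[j][j]
        for i in range(j + 1, n):
            f = a[i][j] / a[j][j]
            for k in range(j, n):
                a[i][k] -= f * a[j][k]
    return d

# shared factor (x - 1): the resultant vanishes
assert det(sylvester([1, 0, -1], [1, -2, 1])) == 0
# x^2 - 1 and x^2 + 1 are coprime: here the determinant works out to 4
assert det(sylvester([1, 0, -1], [1, 0, 1])) == 4
```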
We can see exactly what happens during Algorithms
E and C when the gcd is not 1 by considering $u(x) = w(x)u↓1(x),
v(x) = w(x)u↓2(x)$, where $u↓1(x)$ and $u↓2(x)$ are relatively
prime and $w(x)$ is primitive. Then if the polynomials $u↓1(x),
u↓2(x), u↓3(x), \ldots$ are obtained when Algorithm E works on
$u(x) = u↓1(x)$ and $v(x) = u↓2(x)$, it is easy to show that
the sequence obtained for $u(x) = w(x)u↓1(x), v(x) = w(x)u↓2(x)$
is simply $w(x)u↓1(x), w(x)u↓2(x), w(x)u↓3(x), w(x)u↓4(x),
\ldotss$. With Algorithm C the behavior is different; if the polynomials
$u↓1(x), u↓2(x), u↓3(x), \ldots$ are obtained when Algorithm
C is applied to $u(x) = u↓1(x)$ and $v(x) = u↓2(x)$, and if
we assume that deg$(u↓{j+1}) =$ deg$(u↓j) - 1$ (which is almost
always true when $j > 1$), then the sequence
$$w(x)u↓1(x),\quad w(x)u↓2(x),\quad ℓ↑2w(x)u↓3(x),\quad
ℓ↑4w(x)u↓4(x),\quad ℓ↑6w(x)u↓5(x),\quad \ldots\eqno (27)$$
is obtained when Algorithm C is applied to $u(x)
= w(x)u↓1(x)$ and $v(x) = w(x)u↓2(x)$, where $ℓ = ℓ(w)$.\xskip
(See exercise 13.)\xskip So Algorithm E may be superior to Algorithm
C when the primitive part of the greatest common divisor has
a large enough leading coefficient.
Important alternative ways to calculate polynomial
gcd's over the integers are discussed at the end of Section
4.6.2. For a general determinant-evaluation algorithm, which
essentially includes Algorithm C as a special case, see E. H.
Bareiss, {\sl Math.\ Comp.\ \bf 22} (1968), 565--578.
Polynomial remainder sequences such as those in
Algorithms C and E are not useful merely for finding greatest
common divisors; another important application is the enumeration
of real roots, for a given polynomial in a given interval, according
to the famous theorem of J. Sturm [{\sl M\'emoires pr\'esent\'es
par divers savants \bf 6} (Paris, 1835), 271--318]. Let $u(x)$
be a polynomial over the real numbers, having distinct roots.
We shall see in the next section that this is the same as saying
gcd$\biglp u(x), u↑\prime (x)\bigrp = 1$, where $u↑\prime (x)$
is the derivative of $u(x)$; accordingly, there is a polynomial
remainder sequence that proves that $u(x)$ is relatively prime
to $u↑\prime (x)$. We set $u↓0(x) = u(x), u↓1(x) = u↑\prime
(x)$, and (following Sturm) we negate the sign of all remainders:
|qleft
|qctr \19skip (28)|zfa
|qright \-19skip \hfill c$↓1u↓0(x) ⊗= u↓1(x)q↓1(x) - d↓1u↓2(x),\cr
\4skip \hfill c↓2u↓1(x) ⊗= u↓2(x)q↓2(x) - d↓2u↓3(x),\cr
\10skip \hfill c↓ku↓{k-1}(x) ⊗= u↓k(x)q↓k(x) - d↓ku↓{k+1}(x),\cr
\6skip$ for some positive constants $c↓j$ and $d↓j$, where deg($u↓{k+1})
= 0$. We say that the {\sl variation V(u, a)} of $u(x)$ at $a$
is the number of changes of sign in the sequence $u↓0(a), u↓1(a),
\ldotss$, $u↓{k+1}(a)$, not counting zeros. For example, if the
sequence of signs is 0, +, -, -, 0, +, +, - we have $V(u, a)
= 3$. Sturm's theorem asserts that {\sl the number of roots
of u(x) in the interval a < x ≤ b is V(u, a) - V(u, b);} and
the proof is surprisingly short (see exercise 22).
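Sturm's rule can be sketched directly from (28), taking all $c↓j = d↓j = 1$; everything below is our own illustrative code, using exact rational arithmetic.

```python
from fractions import Fraction

def polyrem(u, v):
    """Remainder of u divided by v; coefficient lists, highest degree first."""
    u = [Fraction(c) for c in u]
    while len(u) >= len(v):
        f = u[0] / v[0]
        u = [a - f * b
             for a, b in zip(u[1:], list(v[1:]) + [Fraction(0)] * (len(u) - len(v)))]
        while u and u[0] == 0:
            u = u[1:]
    return u

def deriv(u):
    n = len(u) - 1
    return [c * (n - i) for i, c in enumerate(u[:-1])]

def sturm_chain(u):
    chain = [u, deriv(u)]
    while len(chain[-1]) > 1:
        r = polyrem(chain[-2], chain[-1])
        if not r:
            break
        chain.append([-c for c in r])    # negate each remainder, as in (28)
    return chain

def variations(chain, x):
    """V(u, x): sign changes in the chain evaluated at x, zeros not counted."""
    signs = []
    for p in chain:
        val = sum(Fraction(c) * Fraction(x) ** (len(p) - 1 - i)
                  for i, c in enumerate(p))
        if val:
            signs.append(1 if val > 0 else -1)
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

u = [1, 0, -3, 1]                        # x^3 - 3x + 1: squarefree, three real roots
chain = sturm_chain(u)
count = variations(chain, -2) - variations(chain, 2)
assert count == 3                        # all three roots lie in -2 < x <= 2
```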
|qleft \24skip {\:r EXERCISES
\exno 1. [10] Compute the pseudo-quotient
$q(x)$ and pseudo-remainder $r(x)$, namely, the polynomials
satisfying (8), when $u(x) = x↑6 + x↑5 - x↑4 + 2x↑3 + 3x↑2 -
x + 2$ and $v(x) = 2x↑3 + 2x↑2 - x + 3$, over the integers.
\exno 2. [15] What is the greatest common divisor of $3x↑6 +
x↑5 + 4x↑4 + 4x↑3 + 3x↑2 + 4x + 2$ and its ``reverse'' $2x↑6
+ 4x↑5 + 3x↑4 + 4x↑3 + 4x↑2 + x + 3$, modulo 7?
\exno 3. [M25] Show that Euclid's algorithm for polynomials
over a field $S$ can be extended to find polynomials $U(x),
V(x)$ over $S$ such that
$$$u(x)V(x) + U(x)v(x) =$ gcd$\biglp u(x), v(x)\bigrp $.
|qctr \6skip (Cf.\ Algorithm 4.5.2X.)\xskip What are the degrees of
the polynomials $U(x)$ and $V(x)$ that are computed by this
extended algorithm? Prove that if $S$ is the field of rational
numbers, and if $u(x) = x↑m - 1$ and $v(x) = x↑n - 1$, then
the extended algorithm yields polynomials $U(x)$ and $V(x)$
having {\sl integer} coefficients. Find $U(x)$ and $V(x)$ when
$u(x) = x↑{21} - 1, v(x) = x↑{13} - 1$.
\exno 4. [M30] Let $p$ be prime, and suppose that Euclid's algorithm
applied to the polynomials $u(x)$ and $v(x)$ modulo $p$ yields
a sequence of polynomials having respective degrees $m, n, n↓1,
\ldotss$, $n↓t$, $-∞$, where $m =$ deg$(u), n =$ deg$(v)$, and $n↓t
≥ 0$. Assume that $m ≥ n$. If $u(x)$ and $v(x)$ are monic polynomials,
independently and uniformly distributed over all the $p↑{m+n}$
pairs of monic polynomials having respective degrees $m$ and
$n$, what is the average value of the three quantities $t, n↓1
+\cdots + n↓t, (n - n↓1)n↓1 +\cdots +
(n↓{t-1} - n↓t)n↓t$, as a function of $m, n$, and $p?$ (These
three quantities are the fundamental factors in the running
time of Euclid's algorithm applied to polynomials modulo $p$,
assuming that division is done by Algorithm D.)\xskip[{\sl Hint:}
Show that $u(x)$ mod $v(x)$ is uniformly distributed and
independent of $v(x)$.]
\exno 5. [M22] What is the probability that $u(x)$ and $v(x)$
are relatively prime modulo $p$, if $u(x)$ and $v(x)$ are independently
and uniformly distributed monic polynomials of degree $n?$
\exno 6. [M23] We have seen that Euclid's Algorithm 4.5.2A for
integers can be directly adapted to an algorithm for the greatest
common divisor of polynomials. Can the ``binary gcd algorithm,''
Algorithm 4.5.2B\null, be adapted in an analogous way to an algorithm
that applies to polynomials?
\exno 7. [M10] What are the units in the domain of all polynomials
over a unique factorization domain $S?$
%folio 535 galley 8 (C) Addison-Wesley 1978 -
\def\\#1({\mathop{\hjust{#1}}(}
\exno 9. [M25] Let $u(x)$ and
$v(x)$ be primitive polynomials over a unique factorization
domain $S$. Prove that $u(x)$ and $v(x)$ are relatively prime
if and only if there are polynomials $U(x)$ and $V(x)$ such
that $u(x)V(x) + U(x)v(x)$ is a polynomial of degree zero.\xskip [{\sl
Hint:} Extend Algorithm E\null, as Algorithm 4.5.2E is extended
in exercise 3.]
\exno 10. [M28] Prove that the polynomials over a unique factorization
domain form a unique factorization domain.\xskip [{\sl Hint:}
Use the result of exercise 9 to help show that there is at
most one kind of factorization possible.]
\exno 11. [M22] What row names would have appeared in Table
1 if the sequence of degrees had been 9, 6, 5, 2, -∞ instead
of 8, 6, 4, 2, 1, 0?
\exno 12. [M24] Let $u↓1(x), u↓2(x), u↓3(x), \ldots$ be a sequence
of polynomials obtained by Algorithm C\null. ``Sylvester's matrix''
is the square matrix formed from rows $A↓{n↓2-1}$ through
$A↓0$, $B↓{n↓1-1}$ through $B↓0$ (in a notation analogous
to that of Table 1). Show that if $u↓1(x)$ and $u↓2(x)$ have
a common factor of positive degree, then the determinant of
Sylvester's matrix is zero; conversely, given that deg$(u↓k)
= 0$ for some $k$, show that the determinant of Sylvester's
matrix is nonzero by deriving a formula for its absolute value
in terms of $ℓ(u↓j)$ and deg($u↓j)$, $1 ≤ j ≤ k$.
\exno 13. [M20] Show that the leading coefficient $ℓ$ of the
primitive part of gcd$\biglp u(x), v(x)\bigrp$ enters into Algorithm C's
polynomial sequence as in (27), when $t↓2 = t↓3 =\cdots
= t↓k = 2$.
\exno 14. [M29] Let $r(x)$ be the pseudo-remainder of $u(x)$
pseudo-divided by $v(x)$. If deg$(u) ≥$ deg$(v) + 2$, show
that $r(x)$ is a multiple of $ℓ(v)$.
\exno 15. [M26] Prove Hadamard's inequality (25).\xskip [{\sl Hint:}
Consider the matrix $AA↑T$.]
\exno 16. [HM22] Let $f(x↓1, \ldotss , x↓n)$ be a multivariate
polynomial with real coefficients not all zero, and let $a↓N$
be the number of solutions to the equation $f(x↓1, \ldotss ,
x↓n) = 0$ such that $|x↓1| ≤ N, \ldotss , |x↓n| ≤ N$, and each
$x↓j$ is an integer. Prove that
$$\lim↓{N→∞} a↓N/(2N + 1)↑n = 0.$$
\exno 17. [M32] ({\sl P. M. Cohn's algorithm
for division of string polynomials.})\xskip Let $A$ be an ``alphabet,''
i.e., a set of symbols. A {\sl string α} on $A$ is a sequence
of $n ≥ 0$ symbols, $α = a↓1 \ldotsm a↓n$, where each $a↓j$ is
in $A$. The length of $α$, denoted by $|α|$, is the number $n$
of symbols. A {\sl string polynomial} on $A$ is a finite sum
$U = \sum ↓k r↓kα↓k$, where each $r↓k$ is a nonzero rational
number and each $α↓k$ is a string on $A$. The {\sl degree} of
$U$, deg($U)$, is defined to be -∞ if $U = 0$ (i.e., if the
sum is empty), and max $|α↓k|$ if $U ≠ 0$. The sum and product
of string polynomials are defined in an obvious manner, e.g.,
$(\sum ↓j r↓jα↓j)(\sum ↓k s↓kβ↓k) = \sum ↓{j,k} r↓js↓kα↓jβ↓k$,
where the product of two strings is obtained by simply juxtaposing
them. For example, if $A = \{a, b\}, U = ab + ba - 2a - 2b,
V = a + b - 1$, then deg$(U) = 2$, deg($V) = 1, V↑2 = aa + ab
+ ba + bb - 2a - 2b + 1$, and $V↑2 - U = aa + bb + 1$. Clearly,
deg($UV) =$ deg$(U) +$ deg($V)$, deg($U + V) ≤$ max\biglp deg($U)$,
deg$(V)\bigrp $, with equality in the latter formula if deg($U)
≠$ deg$(V)$. (String polynomials may be regarded as ordinary
multivariate polynomials over the field of rational numbers,
except that the variables are {\sl not commutative} under multiplication.
In the conventional language of pure mathematics, the set of
string polynomials with the operations defined here is the free
associative algebra generated by $A$ over the rationals.)
|qleft \3skip \quad a) Let $Q↓1, Q↓2, U, V$ be string polynomials
with deg($U) ≥$ deg$(V)$ and deg($Q↓1U - Q↓2V) <$ deg($Q↓1U)$.
Give an algorithm to find a string polynomial $Q$ such that
deg($U - QV) <$ deg($U)$. (Thus if we are given $U$ and $V$
such that $Q↓1U = Q↓2V + R$ and deg($R) <$ deg($Q↓1U)$, for
some $Q↓1$ and $Q↓2$, there is a solution to these conditions
with $Q↓1 = 1.)$
b) Given that $U$ and $V$ are string polynomials
with deg$(V) >$ deg($Q↓1U - Q↓2V)$ for some $Q↓1$ and $Q↓2$,
show that the result of (a) can be improved to find a quotient
$Q$ such that $U = QV + R$, deg($R) <$ deg($V)$.\xskip (This is the
analog of (1) for string polynomials; part (a) showed that we
can make deg($R) <$ deg($U)$, under weaker hypotheses.)
c) A ``homogeneous'' polynomial is one in which
all terms have the same degree (length). If $U↓1, U↓2, V↓1,
V↓2$ are homogeneous string polynomials with $U↓1V↓1 = U↓2V↓2$
and deg($V↓1) ≥$ deg($V↓2)$, show that there is a homogeneous
string polynomial $U$ such that $U↓2 = U↓1U, V↓1 = UV↓2$.
d) Given that $U$ and $V$ are homogeneous string
polynomials with $UV = VU$, prove that there is a homogeneous
string polynomial $W$ such that $U = rW↑m, V = sW↑n$ for some
integers $m, n$ and rational numbers $r, s$. Give an algorithm
to compute such a $W$ having the largest possible degree. (This
algorithm is of interest, for example, when $U = α$ and $V =
β$ are strings satisfying $αβ = βα$; then $W$ is simply a string
$\gamma $. When $U = x↑m$ and $V = x↑n$, the solution of largest
degree is $W = x↑{\gcd(m,n)}$, so this algorithm includes a
gcd algorithm for integers as a special case.)
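For plain strings, part (d) reduces to a classical combinatorial fact about commuting words: if $αβ = βα$, the longest common root $γ$ is simply the prefix of length gcd$(|α|, |β|)$ of either string. A small sketch (the function name is ours):

```python
from math import gcd

def common_power_root(alpha, beta):
    """If alpha and beta commute under concatenation, return the longest
    string gamma such that both are powers of gamma; otherwise None."""
    if alpha + beta != beta + alpha:
        return None                      # the strings do not commute
    g = gcd(len(alpha), len(beta))
    return (alpha or beta)[:g]

assert common_power_root("abab", "ab") == "ab"
assert common_power_root("aaa", "aa") == "a"
assert common_power_root("ab", "ba") is None
```

When both inputs are powers of a single letter, this is exactly the integer gcd computed on exponents, as the exercise remarks.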
\exno 18. [M24] ({\sl Euclidean algorithm for string polynomials.})\xskip
Given string polynomials $V↓1, V↓2$ not both zero, which have
a ``common left multiple,'' i.e., for which there exist $U↓1,
U↓2$, not both zero, such that $U↓1V↓1 = U↓2V↓2$, the purpose
of this exercise is to find an algorithm to compute their ``greatest
common right divisor'' gcrd($V↓1, V↓2)$ as well as their ``least
common left multiple'' lclm($V↓1, V↓2)$. The latter quantities
are defined as follows: gcrd($V↓1, V↓2)$ is a common right divisor
of $V↓1$ and $V↓2$ (that is, $V↓1 = W↓1\,$gcrd($V↓1, V↓2)$ and
$V↓2 = W↓2\,$gcrd($V↓1, V↓2)$ for some $W↓1$, $W↓2$), and any common
right divisor of $V↓1$ and $V↓2$ is a right divisor of gcrd($V↓1,
V↓2)$; lclm($V↓1, V↓2)$ is a common left multiple (that is,
lclm($V↓1, V↓2) = Z↓1V↓1 = Z↓2V↓2$ for some $Z↓1$, $Z↓2$),
and any common left multiple of $V↓1$ and $V↓2$ is a left multiple
of lclm($V↓1, V↓2)$.
For example, let $U↓1 = abbbab + abbab - bbab +
ab - 1, V↓1 = babab + abab + ab - b; U↓2 = abb + ab - b, V↓2
= babbabab + bababab + babab + abab - babb - 1$. Then we have
$U↓1V↓1 = U↓2V↓2 = abbbabbabab + abbabbabab + abbbababab + abbababab
- bbabbabab + abbbabab - bbababab + 2abbabab - abbbabb + ababab
- abbabb - bbabab - babab + bbabb - abb - ab + b$. For these
string polynomials it can be shown that gcrd($V↓1, V↓2) = ab
+ 1$, and lclm($V↓1, V↓2) = U↓1V↓1$.
The division algorithm of exercise 17 may be restated
thus: If $V↓1$ and $V↓2$ are string polynomials, with $V↓2 ≠
0$, and if $U↓1 ≠ 0$ and $U↓2$ satisfy the equation $U↓1V↓1
= U↓2V↓2$, then there exist string polynomials $Q$ and $R$ such
that $V↓1 = QV↓2 + R$, deg($R) <$ deg($V↓2)$.\xskip {\sl Note:} It
follows readily that $Q$ and $R$ are uniquely determined; they
do not depend on $U↓1$ and $U↓2$; furthermore the result is
right-left symmetric in the sense that
$$$U↓2 = U↓1Q + R↑\prime ,\qquad$ where\qquad deg($R↑\prime )
=$ deg$(U↓1) -$ deg$(V↓2) +$ deg($R) <$ deg$(U↓1)$.
|qctr \6skip \quad Show that this division algorithm can be
extended to an algorithm that computes lclm($V↓1, V↓2)$ and
gcrd($V↓1, V↓2)$, and, in fact, it finds string polynomials
$Z↓1, Z↓2$ such that $Z↓1V↓1 + Z↓2V↓2 =$ gcrd($V↓1, V↓2)$.\xskip [{\sl Hint:}
Use auxiliary variables $u↓1, u↓2, v↓1, v↓2, w↓1, w↓2, w↑\prime↓{1},
w↑\prime↓{2}, z↓1, z↓2, z↑\prime↓{1}, z↑\prime↓{2}$,
whose values are string polynomials; start by setting $u↓1 ←
U↓1, u↓2 ← U↓2, v↓1 ← V↓1, v↓2 ← V↓2$, and throughout the algorithm
maintain the conditions
$$
|qctr \hfill U$↓1w↓1 + U↓2w↓2 ⊗= u↓1,\hfill z↓1V↓1 + z↓2V↓2
⊗= v↓1,\cr
\4skip \hfill U↓1w↑\prime↓{1} + U↓2w↑\prime↓{2} ⊗= u↓2,\hfill
z↑\prime↓{1}V↓1 + z↑\prime↓{2}V↓2 ⊗= v↓2,\cr
\4skip \hfill u↓1z↓1 - u↓2z↑\prime↓{1} ⊗= (-1)↑nU↓1,\hfill
w↓1v↓1 - w↑\prime↓{1}v↓2 ⊗= (-1)↑nV↓1,\cr
\4skip \hfill -u↓1z↓2 + u↓2z↑\prime↓{2} ⊗= (-1)↑nU↓2,\hfill
-w↓2v↓1 + w↑\prime↓{2}v↓2 ⊗= (-1)↑nV↓2\cr
\6skip$ at the $n$th iteration.]
\exno 19. [M39] ({\sl Common divisors of square matrices.})\xskip Exercise
18 shows that the concept of greatest common right divisor can
be meaningful when multiplication is not commutative. Prove
that any two $n \times n$ matrices $A$ and $B$ of integers have
a greatest common right matrix divisor $D$.\xskip [{\sl Suggestion:}
Design an algorithm whose inputs are $A$ and $B$, and whose
outputs are integer matrices $P, Q, X, Y$, where $A = PD$, $B
= QD$, $D = XA + YB$. Find a greatest common right divisor of
$({1\atop 3}$ ${2\atop 4})$ and $({4\atop 2}$ ${3\atop 1})$.]
\exno 20. [M40] What can be said about
calculation of the greatest common divisor of polynomials whose
coefficients are floating-point numbers? (Investigate the accuracy
of Euclid's algorithm.)
\exno 21. [M25] Assuming that $t↓3 = t↓4 =\cdots
= t↓k = 2$ \biglp see (22)\bigrp , prove that the computation
time required by Algorithm C to compute the gcd of two $n$th
degree polynomials over the integers is $O\biglp n↑4(\log Nn)↑2\bigrp
$, if the coefficients of the given polynomials are bounded
by $N$ in absolute value.
\exno 22. [M23] Prove Sturm's theorem.\xskip
[{\sl Hint:} Some sign sequences are impossible.]
\exno 23. [M22] Prove that if $u(x)$ in (28) has deg$(u)$ real
roots then deg($u↓{j+1}) =$ deg$(u↓j) - 1$ for $0 ≤ j ≤ k$.
|qleft \18skip {\:r 4.6.2. Factorization
of Polynomials}
\6skip {\bf Factoring modulo p}. Let us now consider
the problem of {\sl factoring} polynomials, not merely finding
the greatest common divisor of two or more of them. As in the
case of integer numbers (Sections 4.5.2, 4.5.4), the problem
of factoring seems to be more difficult than finding the greatest
common divisor; but factorization of polynomials modulo a prime
integer $p$ is not as hard to do as we might expect. It is much
easier to find the factors of an arbitrary polynomial of degree
$n$, modulo 2, than to use any known method to find the factors
of an arbitrary $n$-bit binary number. This surprising situation
is a consequence of an important factorization algorithm discovered
in 1967 by Elwyn R. Berlekamp [{\sl Bell System Technical J.
\bf 46} (1967), 1853--1859].
Let $p$ be a prime number; all arithmetic on polynomials
in the following discussion will be done modulo $p$. Suppose
that someone has given us a polynomial $u(x)$, whose coefficients
are chosen from the set \{0, 1, \ldotss , $p - 1\}$; we may assume
that $u(x)$ is monic. Our goal is to express $u(x)$ in the form
$$$u(x) = p↓1(x)↑{e↓1} \ldotsm p↓r(x)↑{e↓r},\eqno (1)$
|qctr \6skip where $p↓1(x), \ldotss , p↓r(x)$ are distinct, monic
irreducible polynomials.
%folio 539 galley 9 Almost total loss. (C) Addison-Wesley 1978 -
\def\\#1({\mathop{\hjust{#1}}(}
As a first step, we can use a standard technique
to determine whether any of the exponents $e↓1$, $\ldotss$, $e↓r$ are greater
than unity. If
$$u(x) = u↓nx↑n +\cdots + u↓0 = v(x)↑2w(x),\eqno(2)$$
then its ``derivative'' formed in the usual way
(but modulo $p$) is
$$u↑\prime (x) = nu↓nx↑{n-1} +\cdots + u↓1 = 2v(x)v↑\prime(x)w(x)+v(x)↑2w↑\prime(x),
\eqno (3)$$
and this is a multiple of the squared factor $v(x)$.
Therefore our first step in factoring $u(x)$ is to form
$$\gcd\biglp u(x), u↑\prime (x)\bigrp = d(x).\eqno (4)$$
If $d(x)$ is equal to 1, we know that $u(x)$
is ``squarefree,'' the product of distinct primes $p↓1(x) \ldotsm
p↓r(x)$. If $d(x)$ is not equal to 1 and $d(x) ≠ u(x)$, then
$d(x)$ is a proper factor of $u(x)$; so the process can be completed
by factoring $d(x)$ and $u(x)/d(x)$ separately. Finally, if
$d(x) = u(x)$, we must have $u↑\prime (x) = 0$; hence the coefficient
$u↓k$ of $x↑k$ is nonzero only when $k$ is a multiple of $p$.
This means that $u(x)$ can be written as a polynomial of the
form $v(x↑p)$, and in such a case we have
$$u(x) = v(x↑p) = \biglp v(x)\bigrp ↑p;\eqno (5)$$
the factorization process can be completed by finding
the irreducible factors of $v(x)$ and raising them to the $p$th
power.
Identity (5) may appear somewhat strange to the
reader; it is an important fact, basic to Berlekamp's
algorithm, and easily proved: If $v↓1(x)$ and $v↓2(x)$
are any polynomials modulo $p$, then $\biglp v↓1(x) + v↓2(x)\bigrp ↑p
= v↓1(x)↑p + v↓2(x)↑p$, since all the other binomial coefficients
are multiples of $p$; furthermore $a↑p ≡ a$ (modulo $p)$ for
every integer $a$, by Fermat's theorem. Hence
$$
|qctr \hfill v(x)$↑p ⊗= (v↓mx↑m)↑p + (v↓{m-1}x↑{m-1})↑p +
\cdots + (v↓0)↑p\cr
\4skip ⊗= v↓mx↑{mp} + v↓{m-1}x↑{(m-1)p} +\cdots +
v↓0 = v(x↑p).\cr
\6skip$ This proves (5).
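The identity is easy to check numerically; this tiny sketch (coefficient lists run lowest degree first here, and the setup is ours) expands $v(x)↑p$ modulo $p$ for $v(x) = 1 + x + x↑2$ and $p = 5$ and compares it with $v(x↑p)$.

```python
p = 5

def polymul_mod(u, v, p):
    """Product of coefficient lists (lowest degree first), coefficients mod p."""
    w = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            w[i + j] = (w[i + j] + a * b) % p
    return w

v = [1, 1, 1]                      # v(x) = 1 + x + x^2
vp = [1]
for _ in range(p):
    vp = polymul_mod(vp, v, p)     # v(x)^p  (mod p)

v_of_xp = [0] * (2 * p + 1)        # v(x^p) = 1 + x^p + x^(2p)
for k, c in enumerate(v):
    v_of_xp[k * p] = c

assert vp == v_of_xp               # identity (5) holds coefficient by coefficient
```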
The above remarks show that the problem of factoring
a polynomial reduces to the problem of factoring a squarefree
polynomial.
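The reduction can be sketched as follows; `pgcd` and the coefficient-list conventions are ours. The code checks the example polynomial (15) for squarefreeness modulo 13, and illustrates the case $d(x) = u(x)$, where every exponent with a nonzero coefficient is a multiple of $p$ and a $p$th root is extracted as in (5).

```python
def pgcd(u, v, p):
    """Monic gcd of u and v modulo the prime p (coefficient lists, highest first)."""
    while v:
        r = u
        inv = pow(v[0], p - 2, p)            # inverse of l(v), since p is prime
        while len(r) >= len(v):
            f = r[0] * inv % p
            r = [(a - f * b) % p
                 for a, b in zip(r[1:], v[1:] + [0] * (len(r) - len(v)))]
            while r and r[0] == 0:
                r = r[1:]
        u, v = v, r
    inv = pow(u[0], p - 2, p)
    return [c * inv % p for c in u]

def deriv_mod(u, p):
    n = len(u) - 1
    d = [c * (n - i) % p for i, c in enumerate(u[:-1])]
    while d and d[0] == 0:
        d = d[1:]
    return d

p = 13
u = [1, 0, 1, 0, 10, 10, 8, 2, 8]            # u(x) of (15), modulo 13
assert pgcd(u, deriv_mod(u, p), p) == [1]    # d(x) = 1: u is squarefree

# The case d(x) = u(x): the derivative vanishes, so u(x) = v(x^p) = v(x)^p
q = 3
w = [1, 0, 0, 0, 0, 0, 1]                    # x^6 + 1 = (x^2 + 1)^3 modulo 3
assert deriv_mod(w, q) == []                 # u'(x) = 0 mod 3
v = [w[i] for i in range(0, len(w), q)]      # keep every q-th coefficient
assert v == [1, 0, 1]                        # v(x) = x^2 + 1, and u = v(x)^3
```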
%folio 541 galley 10 (C) Addison-Wesley 1978 -
The solutions $v(x)$ to congruence (8) therefore
provide a key to the factorization of $u(x)$. It may seem harder
to find all solutions to (8) than to factor $u(x)$ in the first
place, but in fact this is not true, since the set of solutions
to (8) is closed under addition. Let deg$(u) = n$; we can construct
the $n \times n$ matrix
$$
|qctr ⊗q$↓{0,0}\quad ⊗q↓{0,1}\quad ⊗\cdots \quad
⊗q↓{0,n-1}\quad \cr
\3skip Q =⊗\vdots⊗⊗\vdots⊗\eqno (11)\cr
\3skip ⊗q↓{n-1,0}⊗q↓{n-1,1}⊗\cdots ⊗q↓{n-1,n-1}\cr
\6skip$ where
$$$x↑{pk} ≡ q↓{k,n-1}x↑{n-1} +\cdots + q↓{k,1}x +
q↓{k,0}\quad \biglp$ modulo $u(x)\bigrp .\eqno (12)$
|qctr \6skip |newtype Then $v(x) = v↓{n-1}x↑{n-1} +\cdots
+ v↓1x + v↓0$ is a solution to (8) if and only if
$$$(v↓0, v↓1, \ldotss , v↓{n-1})Q = (v↓0, v↓1, \ldotss , v↓{n-1}).\eqno
(13)$
|qctr \6skip For the latter equation holds if and only if
$$$v(x) = \sum ↓{j} v↓jx↑j = \sum ↓{j} \sum ↓{k} v↓kq↓{k,j}x↑j
≡ \sum ↓{k} v↓kx↑{pk} = v(x↑p) = v(x)↑p\quad \biglp$ modulo $u(x)\bigrp .\cr
\6skip$ \quad Berlekamp's factoring algorithm therefore proceeds
as follows:
$$|indent {\bf B1.} Ensure that $u(x)$ is squarefree; i.e.,
if gcd$\biglp u(x), u↑\prime (x)\bigrp ≠ 1$, reduce
the problem of factoring $u(x)$, as stated earlier in this section.
\algstep B2. Form the matrix $Q$ defined by (11) and (12). This
can be done in one of two ways, depending on whether or not
$p$ is very large, as explained below.
\algstep B3. ``Triangularize'' the matrix $Q - I$, where $I = (\delta ↓{ij})$
is the $n \times n$ identity matrix, finding its rank $n - r$
and finding linearly independent vectors $v↑{[1]}, \ldotss$,
v↑{[r]}$ such that $v↑{[j]}(Q - I) = (0, 0, \ldotss , 0)$ for
$1 ≤ j ≤ r$. (The first vector $v↑{[1]}$ may always be taken
as $(1, 0, \ldotss , 0)$, representing the trivial solution $v↑{[1]}(x)
= 1$ to (8). The ``triangularization'' needed in this step can
be done using appropriate column operations, as explained in
Algorithm N below.) {\sl At this point, r is the number of irreducible
factors of u(x)}, because the solutions to (8) are the $p↑r$
polynomials corresponding to the vectors $t↓1v↑{[1]} +
\cdots + t↓rv↑{[r]}$ for all choices of integers $0 ≤ t↓1,
\ldotss, t↓r < p$. Therefore if $r = 1$ we know that $u(x)$
is irreducible, and the procedure terminates.
\algstep B4. Calculate gcd$\biglp u(x), v↑{[2]}(x) - s\bigrp$
for $0 ≤ s < p$, where $v↑{[2]}(x)$ is the polynomial
represented by vector $v↑{[2]}$. The result will be a nontrivial
factorization of $u(x)$, because $v↑{[2]}(x) - s$ is nonzero
and has degree less than deg($u)$, and by exercise 7 we have
$$$u(x) = \prod ↓{0≤s<p}$ gcd$\biglp v(x) - s, u(x)\bigrp \eqno
(14)$
|qctr \6skip \qquad whenever $v(x)$ satisfies (8).
|qleft \qquad \quad If the use of $v↑{[2]}(x)$ does not succeed
in splitting $u(x)$ into $r$ factors, further factors can be
obtained by calculating gcd$\biglp v↑{[k]}(x) - s, w(x)\bigrp$
for $0 ≤ s < p$ and all factors $w(x)$ found so far, for $k
= 3, 4, \ldotss $, until $r$ factors are obtained.\xskip (If we choose
$s↓i ≠ s↓j$ in (7), we obtain a solution $v(x)$ to (8) that
distinguishes $p↓i(x)$ from $p↓j(x)$; some $v↑{[k]}(x) - s$
will be divisible by $p↓i(x)$ and not by $p↓j(x)$, so this procedure
will eventually find all of the factors.)
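Steps B2 through B4 fit in a few dozen lines for small $p$. The following is a sketch only, with all names and conventions ours; the null-space step follows the column-operation idea of Algorithm N described later in this section. It is shown splitting $x↑2 + 1$ modulo 5.

```python
def berlekamp(u, p):
    """Sketch of steps B2-B4 for a squarefree monic u modulo a small prime p.
    Coefficient lists are highest degree first."""
    n = len(u) - 1

    def rem(r, v):                            # r mod v, arithmetic mod p
        inv = pow(v[0], p - 2, p)
        r = [c % p for c in r]
        while len(r) >= len(v):
            f = r[0] * inv % p
            r = [(a - f * b) % p
                 for a, b in zip(r[1:], v[1:] + [0] * (len(r) - len(v)))]
            while r and r[0] == 0:
                r = r[1:]
        return r

    def gcd_(a, b):                           # monic gcd mod p
        while b:
            a, b = b, rem(a, b)
        inv = pow(a[0], p - 2, p)
        return [c * inv % p for c in a]

    # B2: row k of Q holds the coefficients of x^{pk} mod u(x), lowest degree first
    Q = []
    for k in range(n):
        row = rem([1] + [0] * (p * k), u)
        Q.append(([0] * (n - len(row)) + row)[::-1])

    # B3: null space of Q - I by column operations, in the manner of Algorithm N
    A = [[(Q[i][j] - (i == j)) % p for j in range(n)] for i in range(n)]
    c, vs = [-1] * n, []
    for k in range(n):
        j = next((j for j in range(n) if A[k][j] and c[j] < 0), None)
        if j is not None:
            inv = (-pow(A[k][j], p - 2, p)) % p
            for i in range(n):                # make the pivot entry equal to -1
                A[i][j] = A[i][j] * inv % p
            for jj in range(n):
                if jj != j and A[k][jj]:
                    f = A[k][jj]
                    for i in range(n):        # clear the rest of row k
                        A[i][jj] = (A[i][jj] + f * A[i][j]) % p
            c[j] = k
        else:                                 # dependent row: output a null vector
            v = [0] * n
            v[k] = 1
            for jj in range(n):
                if c[jj] >= 0:
                    v[c[jj]] = A[k][jj]
            vs.append(v)

    # B4: split with gcd(u, v(x) - s); equation (14) guarantees full coverage
    factors = [u]
    for v in vs[1:]:                          # vs[0] is the trivial solution v(x) = 1
        vpoly = v[::-1]
        while vpoly[0] == 0:
            vpoly = vpoly[1:]
        nxt = []
        for w in factors:
            if len(w) == 2:                   # linear factors are irreducible
                nxt.append(w)
                continue
            pieces = [g for s in range(p)
                      if len(g := gcd_(w, vpoly[:-1] + [(vpoly[-1] - s) % p])) > 1]
            nxt.extend(pieces or [w])
        factors = nxt
    return factors

assert sorted(berlekamp([1, 0, 1], 5)) == [[1, 2], [1, 3]]   # x^2+1 = (x+2)(x+3) mod 5
assert berlekamp([1, 0, 2], 5) == [[1, 0, 2]]                # x^2+2 is irreducible mod 5
```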
|qleft \12skip |cancelindent \quad As an example of this procedure,
let us now determine the factorization of
$$$u(x) = x↑8 + x↑6 + 10x↑4 + 10x↑3 + 8x↑2 + 2x + 8\eqno (15)$
|qctr \6skip modulo 13. (This polynomial appears in several
of the examples in Section 4.6.1.) A quick calculation using
Algorithm 4.6.1E shows that gcd$\biglp u(x), u↑\prime (x)\bigrp
= 1$; therefore $u(x)$ is squarefree, and we turn to step B2.
Step B2 involves calculating the $Q$ matrix, which in this case
is an $8 \times 8$ array. The first row of $Q$ is always $(1, 0,
0, \ldotss , 0)$, representing the polynomial $x↑0$ mod $u(x)
= 1$. The second row represents $x↑{13}$ mod $u(x)$, and, in
general, $x↑k$ mod $u(x)$ may readily be determined as follows
(for relatively small values of $k$): If
$$u(x) = x↑n + u↓{n-1}x↑{n-1} +\cdots + u↓1x + u↓0$$
and if
$$$x↑k ≡ a↓{k,n-1}x↑{n-1} +\cdots + a↓{k,1}x + a↓{k,0}\quad
\biglp$ modulo $u(x)\bigrp $,
|qctr \6skip then
$$x↑{k+1} |tab ≡ a↓{k,n-1}x↑n +\cdots + a↓{k,1}x↑2
+ a↓{k,0}x⊗\4skip ⊗≡ a↓{k,n-1}(-u↓{n-1}x↑{n-1} -\cdots
- u↓1x - u↓0) + a↓{k,n-2}x↑{n-1} +\cdots +
a↓{k,0}x\cr
\4skip ⊗= a↓{k+1,n-1}x↑{n-1} +\cdots + a↓{k+1,1}x
+ a↓{k+1,0}$,
|qctr \6skip where
$$a↓{k+1,j} = a↓{k,j-1} - a↓{k,n-1}u↓j.\eqno (16)$
|qctr \6skip In this formula $a↓{k,-1}$ is treated as zero,
so that $a↓{k+1,0} = -a↓{k,n-1}u↓0$. The simple ``shift register''
recurrence (16) makes it easy to calculate $x↑1, x↑2, x↑3, \ldots$
mod $u(x)$. Inside a computer, this calculation would of course
be done by keeping a one-dimensional array $(a↓{n-1}, \ldotss
, a↓1, a↓0)$ and repeatedly setting $t ← a↓{n-1}, a↓{n-1} ←
(a↓{n-2} - tu↓{n-1})$ mod $p, \ldotss$, $a↓1 ← (a↓0 - tu↓1)$ mod
$p, a↓0 ← (-tu↓0)$ mod $p$. (We have seen similar procedures
in connection with random-number generation; cf.\ 3.2.2--10.)
For our example polynomial $u(x)$ in (15), we obtain the following
sequence of coefficients of $x↑k$ mod $u(x)$, using arithmetic
modulo 13:
$$|newtype |tab \qquad \qquad \quad |tab \qquad \qquad \quad
|tab \qquad \qquad \quad |tab \qquad \qquad \quad |tab \qquad
\qquad \quad |tab \qquad \qquad \quad |tab \qquad \qquad \quad
|tab \qquad \qquad \quad |tab \qquad \qquad \quad |tab |zfa
⊗$k⊗a↓{k,7}⊗a↓{k,6}⊗a↓{k,5}⊗a↓{k,4}⊗a↓{k,3}⊗a↓{k,2}⊗a↓{k,1}⊗a↓{k,0}\cr
\3skip 0⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0⊗ 1\cr
1⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0⊗ 1⊗ 0\cr
2⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0⊗ 1⊗ 0⊗ 0\cr
3⊗ 0⊗ 0⊗ 0⊗ 0⊗ 1⊗ 0⊗ 0⊗ 0\cr
4⊗ 0⊗ 0⊗ 0⊗ 1⊗ 0⊗ 0⊗ 0⊗ 0\cr
5⊗ 0⊗ 0⊗ 1⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0\cr
6⊗ 0⊗ 1⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0\cr
7⊗ 1⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0⊗ 0\cr
8⊗ 0⊗12⊗ 0⊗ 3⊗ 3⊗ 5⊗11⊗ 5\cr
9⊗12⊗ 0⊗ 3⊗ 3⊗ 5⊗11⊗ 5⊗ 0\cr
10⊗ 0⊗ 4⊗ 3⊗ 2⊗ 8⊗ 0⊗ 2⊗ 8\cr
11⊗ 4⊗ 3⊗ 2⊗ 8⊗ 0⊗ 2⊗ 8⊗ 0\cr
12⊗ 3⊗11⊗ 8⊗12⊗ 1⊗ 2⊗ 5⊗ 7\cr
13⊗11⊗ 5⊗12⊗10⊗11⊗ 7⊗ 1⊗ 2\cr
\6skip |newtype$ Therefore the second row of $Q$ is (2, 1, 7,
11, 10, 12, 5, 11). Similarly we may determine $x↑{26}$ mod
$u(x), \ldotss , x↑{91}$ mod $u(x)$, and we find that $Q$ takes
the form displayed in (17).
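The table of powers above can be reproduced with a few lines of Python. This is only an illustrative sketch, not part of the text; the function name is ours, and the coefficient lists are ordered (a_{n-1}, ..., a_0), exactly as the columns of the table.

```python
def powers_of_x(u_high, p, kmax):
    """Iterate the ``shift register'' recurrence (16).  u_high is the list
    (u_{n-1}, ..., u_1, u_0) of coefficients of a monic u(x) mod p; row k
    of the result holds (a_{k,n-1}, ..., a_{k,0}), i.e. x^k mod u(x)."""
    n = len(u_high)
    a = [0] * (n - 1) + [1]              # x^0
    rows = [a[:]]
    for _ in range(kmax):
        t = a[0]                          # t <- a_{n-1}
        for i in range(n - 1):            # a_{k+1,j} = a_{k,j-1} - t*u_j
            a[i] = (a[i + 1] - t * u_high[i]) % p
        a[n - 1] = (-t * u_high[n - 1]) % p
        rows.append(a[:])
    return rows

# the example polynomial of (15) modulo 13: coefficients (u_7, ..., u_0)
rows = powers_of_x([0, 1, 0, 10, 10, 8, 2, 8], 13, 13)
```

Iterating thirteen times reproduces rows 0 through 13 of the table, the last of which is the second row of $Q$.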
%folio 543 galley 11 (C) Addison-Wesley 1978 -
\def\\#1({\mathop{\hjust{#1}}(}\def\Modulo#1){\penalty0\;\;\biglp\char'155
\char'157\char'144\char'165\char'154\char'157\,\,#1)\bigrp}
\12skip \quad The next step of Berlekamp's procedure requires
finding the ``null space'' of $Q - I$. In general, suppose that
$A$ is an $n \times n$ matrix over a field, whose rank $n -
r$ is to be determined; suppose further that we wish to determine
linearly independent vectors $v↑{[1]}, v↑{[2]}, \ldotss$, $v↑{[r]}$
such that $v↑{[1]}A = v↑{[2]}A = \cdots =v↑{[r]}A = (0, \ldotss
, 0)$. An algorithm for this calculation can be based on the
observation that any column of $A$ may be multiplied by a nonzero
quantity, and any multiple of one of its columns may be added
to a different column, without changing the rank or the vectors
$v↑{[1]}, \ldotss , v↑{[r]}$. (These transformations amount to
replacing $A$ by $AB$, where $B$ is a nonsingular matrix.) The
following well-known ``triangularization'' procedure may therefore
be used.
\algbegin Algorithm N (Null space algorithm).
Let $A = (a↓{ij})$ be an $n \times n$ matrix, whose elements
$a↓{ij}$ belong to a field and have subscripts in the range
$0 ≤ i, j < n$. This algorithm outputs $r$ vectors $v↑{[1]},
\ldotss , v↑{[r]}$, which are linearly independent over the field
and satisfy $v↑{[j]}A = (0, \ldotss , 0)$, where $n - r$ is the
rank of $A$.
\algstep N1. [Initialize.] Set $c↓0 ← c↓1
←\cdots ← c↓{n-1} ← -1, r ← 0$. (During the calculation
we will have $c↓j ≥ 0$ only if $a↓c|infinf j↓j = -1$ and all
other entries of row $c↓j$ are zero.)
\algstep N2. [Loop on {\sl k}.] Do step N3 for $k = 0$, 1, $\ldotss$,
$n - 1$, and then terminate the algorithm.
\algstep N3. [Scan row for dependence.] If there is some $j$
in the range $0 ≤ j < n$ such that $a↓{kj} ≠ 0$ and $c↓j < 0$,
then do the following: Multiply column $j$ of $A$ by $-1/a↓{kj}$
(so that $a↓{kj}$ becomes equal to $-1$); then add $a↓{ki}$ times
column $j$ to column $i$ for all $i ≠ j$; finally set $c↓j ←
k$. (Since it is not difficult to show that $a↓{sj} = 0$ for
all $s < k$, these operations have no effect on rows 0, 1,
$\ldotss$, $k - 1$ of $A$.)
|qleft \qquad \quad On the other hand, if there is no $j$ in
the range $0 ≤ j < n$ such that $a↓{kj} ≠ 0$ and $c↓j < 0$,
then set $r ← r + 1$ and output the vector
$$v↑{[r]} = (v↓0, v↓1, \ldotss , v↓{n-1})$$
defined by the rule
$$v↓j = \cases{a↓{ks},⊗if $c↓s = j ≥ 0$;\cr
1,⊗if $j = k$;\cr
0,⊗otherwise.\cr}\eqno (18)$$
\quad An example will reveal the mechanism
of this algorithm; let $A$ be the matrix $Q - I$ of (17) over
the field of integers modulo 13. When $k = 0$, we output the
vector $v↑{[1]} = (1, 0, 0, 0, 0, 0, 0, 0)$. When $k = 1$, we
may take $j$ in step N3 to be either 0, 2, 3, 4, 5, 6, or 7;
the choice here is completely arbitrary, although it affects
the particular vectors that are chosen to be output by the
algorithm. For hand calculation, it is most convenient to pick
$j = 5$, since $a↓{15} = 12 = -1$ already; the column operations
of step N3 then change $A$ to the matrix
$$ 0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\cr
0\quad 0\quad 0\quad 0\quad 0\quad 12\quad 0\quad 0\cr
11\quad 6\quad 5\quad 8\quad 1\quad 4\quad 1\quad 7\cr
3\quad 3\quad 9\quad 5\quad 9\quad 6\quad 6\quad 4\cr
4\quad 11\quad 2\quad 6\quad 12\quad 1\quad 8\quad 9\cr
5\quad 11\quad 11\quad 7\quad 10\quad 6\quad 1\quad 10\cr
1\quad 11\quad 6\quad 1\quad 6\quad 11\quad 9\quad 3\cr
12\quad 3\quad 11\quad 9\quad 6\quad 11\quad 12\quad 2\cr$$
(The circled element in column ``5,'' row
``1,'' may be used to indicate that $c↓5 = 1$. Remember that
Algorithm N numbers the rows and columns of the matrix starting
with 0, not 1.) When $k = 2$, we may choose $j = 4$ and proceed
in a similar way, obtaining the following matrices, which all
have the same null space as $Q - I:$
$$
|qleft $k = 2⊗k = 3\cr
\3skip 0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0⊗ 0\quad
0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\cr
\2skip 0\quad 0\quad 0\quad 0\quad 0\quad 12\quad 0\quad 0⊗
0\quad 0\quad 0\quad 0\quad 0\quad 12\quad 0\quad 0\cr
\2skip 0\quad 0\quad 0\quad 0\quad 12\quad 0\quad 0\quad 0⊗
0\quad 0\quad 0\quad 0\quad 12\quad 0\quad 0\quad 0\cr
\2skip 8\quad 1\quad 3\quad 11\quad 4\quad 9\quad 10\quad 6⊗
0\quad 12\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\cr
\2skip 2\quad 4\quad 7\quad 1\quad 1\quad 5\quad 9\quad 3⊗ 9\quad
9\quad 8\quad 9\quad 11\quad 8\quad 8\quad 5\cr
\4skip 12\quad 3\quad 0\quad 5\quad 3\quad 5\quad 4\quad 5⊗
1\quad 10\quad 4\quad 11\quad 4\quad 4\quad 0\quad 0\cr
\2skip 0\quad 1\quad 2\quad 5\quad 7\quad 0\quad 3\quad 0⊗ 5\quad
12\quad 12\quad 7\quad 3\quad 4\quad 6\quad 7\cr
\2skip 11\quad 6\quad 7\quad 0\quad 7\quad 0\quad 6\quad 12⊗
2\quad 7\quad 2\quad 12\quad 9\quad 11\quad 11\quad 2\cr
\12skip k = 4⊗k = 5\cr
\3skip 0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0⊗ 0\quad
0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\cr
\2skip 0\quad 0\quad 0\quad 0\quad 0\quad 12\quad 0\quad 0⊗
0\quad 0\quad 0\quad 0\quad 0\quad 12\quad 0\quad 0\cr
\2skip 0\quad 0\quad 0\quad 0\quad 12\quad 0\quad 0\quad 0⊗
0\quad 0\quad 0\quad 0\quad 12\quad 0\quad 0\quad 0\cr
\2skip 0\quad 12\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0⊗
0\quad 12\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\cr
\2skip 0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 12⊗
0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 12\cr
\2skip 1\quad 10\quad 4\quad 11\quad 4\quad 4\quad 0\quad 0⊗12\quad
0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\cr
\2skip 8\quad 2\quad 6\quad 10\quad 11\quad 11\quad 0\quad 9⊗
5\quad 0\quad 0\quad 0\quad 5\quad 5\quad 0\quad 9\cr
\2skip 1\quad 6\quad 4\quad 11\quad 2\quad 0\quad 0\quad 10⊗12\quad
9\quad 0\quad 0\quad 11\quad 9\quad 0\quad 10\cr
$$
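Algorithm N translates almost line for line into a modern language. The following Python sketch is our own rendering, not Knuth's program; the matrix is a list of rows over GF($p$), and the column operations are performed in place.

```python
def null_space(A, p):
    """Algorithm N over GF(p): return linearly independent vectors v
    with v * A = (0, ..., 0) (mod p).  A (a list of rows) is modified
    in place by column operations, which leave its null space unchanged."""
    n = len(A)
    c = [-1] * n                     # c[j] = k if column j holds row k's pivot
    out = []
    for k in range(n):               # steps N2/N3
        for j in range(n):
            if A[k][j] % p != 0 and c[j] < 0:
                break
        else:
            # row k depends on earlier rows: output a vector by rule (18)
            v = [0] * n
            v[k] = 1
            for s in range(n):
                if c[s] >= 0:
                    v[c[s]] = A[k][s] % p
            out.append(v)
            continue
        inv = pow(A[k][j], -1, p)    # multiply column j by -1/a_kj ...
        for i in range(n):
            A[i][j] = -A[i][j] * inv % p
        for jj in range(n):          # ... then add a_k,jj times column j
            t = A[k][jj] % p
            if jj != j and t:
                for i in range(n):
                    A[i][jj] = (A[i][jj] + t * A[i][j]) % p
        c[j] = k
    return out
```

Each output vector corresponds to one dependent row, exactly as in rule (18).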
%folio 545 galley 12 (C) Addison-Wesley 1978 -
We now go to step B4 of the factoring procedure.
Calculation of gcd\biglp $u(x), v↑{[2]}(x) - s\bigrp$ for $s
= 0$, 1, $\ldotss$, 12, where $v↑{[2]}(x) = x↑6 + 5x↑5 + 9x↑4
+ 5x↑2 + 5x$, gives the answer $x↑5 + 5x↑4 + 9x↑3 + 5x + 5$
when $s = 0$, $x↑3 + 8x↑2 + 4x + 12$ when $s = 2$, and unity for
other values of $s$. Therefore $v↑{[2]}(x)$ gives us only two
of the three factors. Turning to gcd\biglp $v↑{[3]}(x) - s,
x↑5 + 5x↑4 + 9x↑3 + 5x + 5\bigrp$, where $v↑{[3]}(x) = x↑7 + 12x↑5
+ 10x↑4 + 9x↑3 + 11x↑2 + 9x$, we obtain the value $x↑4 + 2x↑3
+ 3x↑2 + 4x + 6$ when $s = 6$, $x + 3$ when $s = 8$, and unity
otherwise. Thus the complete factorization is
$$u(x) = (x↑4 + 2x↑3 + 3x↑2 + 4x + 6)(x↑3 + 8x↑2 + 4x + 12)(x
+ 3).\eqno (19)$$
\quad Let us now estimate the running time of Berlekamp's
method when an $n$th degree polynomial is factored modulo $p$.
First assume that $p$ is relatively small, so that the four
arithmetic operations can be done modulo $p$ in essentially
a fixed length of time. (Division modulo $p$ can be converted
to multiplication, by storing a table of reciprocals; modulo
13, for example, we have ${1\over 2} = 7$, ${1\over 3} = 9$,
etc.) The computation in step B1 takes $O(n↑2)$ units of time;
step B2 takes $O(pn↑2)$. For step B3 we use Algorithm N\null, which
requires $O(n↑3)$ units of time at most. Finally, in step B4
we can observe that the calculation of gcd\biglp $f(x), g(x)\bigrp$
by Euclid's algorithm takes $O\biglp$ deg$(f)$ deg$(g)\bigrp$
units of time; hence the calculation of gcd\biglp $v↑{[j]}(x)
- s, w(x)\bigrp$ for fixed $j$ and $s$ and for all factors $w(x)$
of $u(x)$ found so far takes $O(n↑2)$ units. Step B4 therefore
requires $O(prn↑2)$ units of time at most. {\sl Berlekamp's
procedure factors an arbitrary polynomial of degree $n$, modulo
$p$, in $O(n↑3 + prn↑2)$ steps}, when $p$ is a small prime; and
exercise 5 shows that the average number of factors, $r$, is
approximately ln $n$. Thus the algorithm is much faster than
any known methods of factoring $n$-digit numbers in the $p$-ary
number system.
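The gcd computations of step B4 can be sketched as follows. This is an illustrative Python rendering, not the text's program; we adopt the convention that a polynomial is its list of coefficients with the constant term first, and the example data come from the factorization above (the polynomial $u(x)$ of (15) reduced modulo 13).

```python
def poly_divmod(u, v, p):
    """Return (quotient, remainder) of u / v over GF(p)."""
    u, inv = [c % p for c in u], pow(v[-1], -1, p)
    q = [0] * max(len(u) - len(v) + 1, 0)
    for i in reversed(range(len(q))):
        q[i] = u[i + len(v) - 1] * inv % p
        for j, c in enumerate(v):
            u[i + j] = (u[i + j] - q[i] * c) % p
    while len(u) > 1 and u[-1] == 0:
        u.pop()
    return q, u

def poly_gcd(u, v, p):
    """Monic gcd of u and v over GF(p), by Euclid's algorithm."""
    while any(v):
        u, v = v, poly_divmod(u, v, p)[1]
    inv = pow(u[-1], -1, p)
    return [c * inv % p for c in u]

# step B4 on the example: u(x) = x^8 + x^6 + 10x^4 + 10x^3 + 8x^2 + 2x + 8
u = [8, 2, 8, 10, 10, 0, 1, 0, 1]
v2 = [0, 5, 5, 0, 9, 5, 1]          # v^[2](x) = x^6 + 5x^5 + 9x^4 + 5x^2 + 5x
g0 = poly_gcd(u, v2, 13)                           # s = 0
g2 = poly_gcd(u, [(v2[0] - 2) % 13] + v2[1:], 13)  # s = 2
```

The two nontrivial gcds agree with the factors quoted in the text for $s = 0$ and $s = 2$.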
When $p$ is large, a different implementation of
Berlekamp's procedure would be used for the calculations. Division
modulo $p$ would not be done with an auxiliary table of reciprocals;
instead the method of exercise 4.5.2--15, which takes $O\biglp
(\log p)↑2\bigrp$ steps, would probably be used. Then step
B1 would take $O\biglp n↑2(\log p)↑2\bigrp$ units of time;
similarly, step B3 takes $O\biglp n↑3(\log p)↑2\bigrp$. In
step B2, we can form $x↑p$ mod $u(x)$ in a more efficient way
than (16) when $p$ is large: Section 4.6.3 shows that this value
can essentially be obtained by using $O(\log p)$ operations
of ``squaring mod $u(x),$'' i.e., going from $x↑k$ mod $u(x)$
to $x↑{2k}$ mod $u(x)$. The squaring operation is relatively
easy to perform if we first make an auxiliary table of $x↑m$
mod $u(x)$ for $m = n, n + 1, \ldotss$, $2n -2$; if
$$x↑k \hjust{ mod } u(x) = c↓{n-1}x↑{n-1} +\cdots
+ c↓1x + c↓0,$$
then
$$x↑{2k} \hjust{ mod } u(x) = c↑{2}↓{n-1}x↑{2n-2} +\cdots
+ 2c↓1c↓0x + c↓0↑2,$$
where $x↑{2n-2}, \ldotss$, $x↑n$ can be replaced
by polynomials in the auxiliary table. The net time to compute
$x↑p$ mod $u(x)$ comes to $O\biglp n↑2(\log p)↑3\bigrp$ units,
and we obtain the second row of $Q$. To get further rows of $Q$,
we form $x↑{2p}$ mod $u(x)$, $x↑{3p}$ mod $u(x)$, $\ldots$ simply
by multiplying repeatedly by $x↑p$ mod $u(x)$, in a fashion
analogous to squaring mod $u(x)$; step B2 is completed in $O\biglp
n↑2(\log p)↑3 + n↑3(\log p)↑2\bigrp$ units of time. The same
upper bound applies to steps B1, B2, and B3 taken as a whole;
these three steps tell us the number of factors of $u(x)$.
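The squaring idea can be expressed in a few lines of Python. This sketch is ours, and it performs the reduction on the fly rather than through the precomputed table of $x↑m$ mod $u(x)$; coefficient lists have the constant term first, and $u$ is monic of degree at least 2.

```python
def poly_mulmod(a, b, u, p):
    """(a*b) mod u(x) over GF(p); u monic, constant term first."""
    n = len(u) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # replace x^m for m >= n, using x^n = -(u_{n-1}x^{n-1} + ... + u_0)
    for m in reversed(range(n, len(prod))):
        c = prod[m]
        if c:
            prod[m] = 0
            for j in range(n):
                prod[m - n + j] = (prod[m - n + j] - c * u[j]) % p
    return (prod + [0] * n)[:n]

def x_power_mod(e, u, p):
    """x^e mod u(x) over GF(p), by repeated squaring."""
    n = len(u) - 1
    result = [1] + [0] * (n - 1)        # the polynomial 1
    base = ([0, 1] + [0] * n)[:n]       # the polynomial x
    while e:
        if e & 1:
            result = poly_mulmod(result, base, u, p)
        base = poly_mulmod(base, base, u, p)
        e >>= 1
    return result
```

For the small example, `x_power_mod(13, u, 13)` reproduces the second row of $Q$ computed earlier by recurrence (16).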
But when $p$ is large and we get to step B4, we
are asked to calculate a greatest common divisor for $p$ different
values of $s$, and this is out of the question if $p$ is very
large. This hurdle was surmounted by Hans Zassenhaus [{\sl J.
Number Theory \bf 1} (1969), 291--311], who showed how to determine
all the ``useful'' values of $s$. For fixed $j$, let $w(x) =
\prod (x - s)$ where the product is over all $0 ≤ s < p$ such
that gcd\biglp $u(x), v↑{[j]}(x) - s\bigrp ≠ 1$. By (14), $w(x)$
is the polynomial of least degree such that $u(x)$ divides $w(v↑{[j]}(x))$.
Algorithm N can therefore be adapted to find the coefficients
of $w$: Let $A$ be the $(r + 1) \times n$ matrix whose $k$th
row contains the coefficients of $v↑{[j]}(x)↑k$ mod
$u(x)$, for $0 ≤ k ≤ r$. Apply the method of Algorithm N until
the first dependence is found in step N3; then the algorithm
terminates with $w(x) = v↓0 + v↓1x +\cdots + v↓kx↑k$,
where $v↓j$ is defined in (18). (An example is worked out below.)
At this point $2 ≤ k ≤ r$; in rare circumstances we may have
$k = n$.
It remains to find the factors of $w(x)$ modulo
a large prime $p$, when $w$ is known to split into linear factors.
Suppose $w(x) = (x -s↓1) \ldotsm (x - s↓k)$; then it divides
$x(x - 1) \ldotsm (x - p + 1) = x↑p - x = x(x↑{(p-1)/2} - 1)(x↑{(p-1)/2}
+ 1)$, hence the identity
$$w(x) = \gcd\biglp w(x), x - t\bigrp \cdot \gcd\biglp w(x), (x
- t)↑{(p-1)/2} - 1\bigrp \cdot \gcd\biglp w(x), (x -
t)↑{(p-1)/2} + 1\bigrp\eqno (20)$$
holds for all integers $t$. Zassenhaus's procedure
for factoring $w$ is to try $t = 0$, 1, 2, $\ldots$ in (20) until
$w$ has been completely split. At first glance this may appear
to be an inefficient trial-and-error method, but actually it
finds the factors very rapidly, since the probability of a nontrivial
gcd in (20) is approximately $1 - 2↑{1-k}$ when $t$ is chosen
at random. The reason is that $x - s↓i$ divides $(x - t)↑{(p-1)/2}
- 1$ if and only if $(s↓i - t)↑{(p-1)/2}$ mod $p = 1$, and this
occurs for about half of all values $t$.
For example, let's reconsider the polynomial $v↑{[3]}(x)
= x↑7 + 12x↑5 + 10x↑4 + 9x↑3 + 11x↑2 + 9x$ mentioned earlier,
and let's pretend that 13 is a large prime. Then
$$
|qctr Matrix $A$⊗is transformed into\cr
\3skip 1\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0⊗12\quad
0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 0\cr
\2skip 0\quad 9\quad 11\quad 9\quad 10\quad 12\quad 0\quad 1⊗
0\quad 0\quad 0\quad 0\quad 0\quad 12\quad 0\quad 0\cr
\2skip 4\quad 4\quad 6\quad 9\quad 1\quad 7\quad 12\quad 1⊗
0\quad 0\quad 0\quad 0\quad 0\quad 0\quad 12\quad 0\cr
\2skip 4\quad 6\quad 3\quad 6\quad 11\quad 8\quad 0\quad 5⊗
9\quad 0\quad 0\quad 0\quad 0\quad 8\quad 0\quad 0\cr
\12skip so we have $w(x) = 9 + 8x + x↑3$. Setting $t =
0$ in (20) produces the factor $x + 1 = x - 12$, and we replace
$w(x)$ by $w(x)/(x - 12) = x↑2 + 12x + 9$. When $t = 1$ we get
no further information, but $t = 2$ yields the other factors
$x - 6$ and $x - 8$. The useful values of $s$ with respect to
$v↑{[3]}(x)$ are 6, 8, and 12.
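Zassenhaus's trial of $t = 0$, 1, 2, $\ldots$ in (20) is easy to express recursively. The Python sketch below is our own (the helper names are not from the text); it assumes $w(x)$ is a product of distinct linear factors modulo an odd prime $p$, with coefficient lists stored constant term first.

```python
def roots_via_20(w, p):
    """Split w(x), a product of distinct linear factors mod p, using
    identity (20); returns the list of its roots."""
    def mul(a, b):
        c = [0] * (len(a) + len(b) - 1)
        for i, x in enumerate(a):
            for j, y in enumerate(b):
                c[i + j] = (c[i + j] + x * y) % p
        return c
    def divrem(a, b):
        a, inv = a[:], pow(b[-1], -1, p)
        q = [0] * max(len(a) - len(b) + 1, 0)
        for i in reversed(range(len(q))):
            q[i] = a[i + len(b) - 1] * inv % p
            for j, c in enumerate(b):
                a[i + j] = (a[i + j] - q[i] * c) % p
        while len(a) > 1 and a[-1] == 0:
            a.pop()
        return q, a
    def gcd(a, b):
        while any(b):
            a, b = b, divrem(a, b)[1]
        inv = pow(a[-1], -1, p)
        return [c * inv % p for c in a]
    if len(w) <= 1:
        return []
    if len(w) == 2:                       # linear: root of w1*x + w0
        return [(-w[0] * pow(w[1], -1, p)) % p]
    for t in range(p):                    # try t = 0, 1, 2, ... as in (20)
        g = gcd(w, [-t % p, 1])           # gcd(w(x), x - t)
        if len(g) == 2:
            return [t] + roots_via_20(divrem(w, g)[0], p)
        h, base, e = [1], [-t % p, 1], (p - 1) // 2
        while e:                          # h = (x - t)^((p-1)/2) mod w(x)
            if e & 1:
                h = divrem(mul(h, base), w)[1]
            base, e = divrem(mul(base, base), w)[1], e >> 1
        h[0] = (h[0] - 1) % p             # ... minus 1
        while len(h) > 1 and h[-1] == 0:
            h.pop()
        if any(h):
            g = gcd(w, h)
            if 1 < len(g) < len(w):       # nontrivial split in (20)
                return roots_via_20(g, p) + roots_via_20(divrem(w, g)[0], p)
    return []
```

Applied to $w(x) = 9 + 8x + x↑3$ with $p = 13$, the sketch recovers the useful values 6, 8, and 12.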
The average time to find the factors of $w(x)$
using (20) will be $O\biglp k↑3(\log p)↑2 + k↑2(\log p)↑3\bigrp
$, according to exercise 14. Since it takes us $O\biglp rn↑2(\log
p)↑2\bigrp$ steps to determine $w(x)$ from $v↑{[j]}(x)$, we
may conclude that step B4 can be accomplished in an average
time of $O\biglp r↑2n↑2(\log p)↑2 + r↑3(\log p)↑3\bigrp$ units,
for large primes $p$. In other words, step B4 is not the bottleneck
after all, when we use Zassenhaus's suggestions, since $r$ is
usually small compared to $n$.
For further discussion, see E. R. Berlekamp, {\sl
Math.\ Comp.\ \bf 24} (1970), 713--735.
\subsectionbegin{Distinct degree factorization} A
somewhat simpler method that often obtains factors modulo $p$
can be based on the fact that an irreducible polynomial $q(x)$
of degree $d$ is a divisor of $x↑{p↑d} - x$, and it is
not a divisor of $x↑{p↑c} - x$ for $c < d$; see exercise
16. We proceed as follows:
\algstep S1. Rule out squared factors and find the matrix $Q$, as in Berlekamp's
method. Also set $v(x) ← u(x)$, $w(x) ← \hjust{``$x$''}$, and $d ← 0$. (Here
$v(x)$ and $w(x)$ are variables that have polynomials as values.)
\algstep S2. Increase $d$ by 1 and replace $w(x)$ by $w(x)↑p$
mod $u(x)$. (In other words, the coefficients $(w↓0, \ldotss
, w↓{n-1})$ are replaced by $(w↓0, \ldotss , w↓{n-1})Q$. At this
point $w(x) = x↑{p↑d}$ mod $u(x)$.)
\algstep S3. Find $g(x) =$ gcd\biglp $w(x) - x, v(x)\bigrp$.
(This is the product of all the irreducible factors of $u(x)$
whose degree is $d$.) Replace $v(x)$ by $v(x)/g(x)$.
\algstep S4. If $d < {1\over 2}$ deg$(v)$, return to S2. Otherwise
$v(x)$ is irreducible, and the procedure terminates.
%folio 550 galley 13 (C) Addison-Wesley 1978 -
\yyskip This procedure determines the product of
all irreducible factors of each degree $d$, and therefore it
tells us how many factors there are of each degree. Since the
three factors of our example polynomial (19) have different
degrees, they would all be discovered. The total running time,
analyzed as above, is $O\biglp n↑3(\log p)↑2 + n↑2(\log p)↑3\bigrp$.
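Steps S1--S4 can be sketched in Python as follows. This is our own rendering: $u(x)$ is assumed already monic and squarefree modulo $p$, coefficient lists have the constant term first, and repeated squaring stands in for the multiplication by $Q$.

```python
def distinct_degree(u, p):
    """Steps S1-S4 on a squarefree monic u(x) over GF(p): return a map
    d -> product of all irreducible factors of u(x) having degree d."""
    def mul(a, b):
        c = [0] * (len(a) + len(b) - 1)
        for i, x in enumerate(a):
            for j, y in enumerate(b):
                c[i + j] = (c[i + j] + x * y) % p
        return c
    def divrem(a, b):
        a, inv = a[:], pow(b[-1], -1, p)
        q = [0] * max(len(a) - len(b) + 1, 0)
        for i in reversed(range(len(q))):
            q[i] = a[i + len(b) - 1] * inv % p
            for j, c in enumerate(b):
                a[i + j] = (a[i + j] - q[i] * c) % p
        while len(a) > 1 and a[-1] == 0:
            a.pop()
        return q, a
    def gcd(a, b):
        while any(b):
            a, b = b, divrem(a, b)[1]
        inv = pow(a[-1], -1, p)
        return [c * inv % p for c in a]
    def powmod(a, e, m):
        r = [1]
        while e:
            if e & 1:
                r = divrem(mul(r, a), m)[1]
            a, e = divrem(mul(a, a), m)[1], e >> 1
        return r
    v, w, d, out = u[:], [0, 1], 0, {}
    while 2 * d < len(v) - 1:                 # S4: continue while d < deg(v)/2
        d += 1                                 # S2: w(x) <- w(x)^p mod u(x)
        w = powmod(w, p, u)
        wx = w[:] + [0] * max(0, 2 - len(w))
        wx[1] = (wx[1] - 1) % p                # w(x) - x
        while len(wx) > 1 and wx[-1] == 0:
            wx.pop()
        g = gcd(v, wx) if any(wx) else v[:]    # S3
        if len(g) > 1:
            out[d] = g
            v = divrem(v, g)[0]
    if len(v) > 1:
        out[len(v) - 1] = v                    # the remaining v(x) is irreducible
    return out
```

Applied to the example $u(x)$ modulo 13, the sketch reports one factor each of degrees 1, 3, and 4, in agreement with (19).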
The distinct degree factorization technique was
known to several people in 1960 [cf.\ S. W. Golomb, L. R. Welch,
A. Hales, ``On the factorization of trinomials over GF(2),''
Jet Propulsion Laboratory memo 20--189 (July 14, 1959)], but
there seem to be no references to it in the ``open literature.''
Previous work by \v S. Schwarz, {\sl Quart.\ J. Math.}, Oxford (2)
{\bf 7} (1956), 110--124, had shown how to determine the number
of irreducible factors of each degree, but not their product,
using the matrix $Q$.
If the distinct degree factorization procedure doesn't find a complete
factorization, it is not necessary to abandon hope: There is still a good
chance that the factors can be found by computing $\gcd\biglp g(x),\,
(x+t)↑{(p↑d-1)/2}-1\bigrp$ for $t=0$, 1, $\ldotss$, $p-1$, when $p$ is
odd and all irreducible factors of $g(x)$ have degree $d$. Every divisor of
$x↑{p↑d}-x$ also divides $(x+t)↑{p↑d}-(x+t)=(x+t)\biglp(x+t)↑{(p↑d-1)/2}-1\bigrp
\biglp(x+t)↑{(p↑d-1)/2}+1\bigrp$, so we can argue as in (20). If $d=1$, all
the linear factors will be found quickly, even when $p$ is large. If $d>1$,
we might not split $g(x)$ completely, but the prospects are good, {\sl especially}
if $p$ is large.
For example, there are eight irreducible polynomials $f(x)$ of degree 3, modulo
3, and they will all be distinguished by this procedure:
$$\vjust{\halign{$\hfill#\null$⊗$\hfill#\null$⊗$\hfill#$\qquad⊗$\ctr{#}$⊗\qquad
$\ctr{#}$⊗\qquad$\ctr{#}$\cr
f(x)⊗⊗⊗\gcd\biglp f(x),x↑{13}-1\bigrp⊗\gcd\biglp f(x),(x+1)↑{13}-1\bigrp⊗
\gcd\biglp f(x),(x+2)↑{13}-1\bigrp\cr
\noalign{\vskip2pt}
x↑3+⊗⊗2x+1⊗1⊗1⊗1\cr
x↑3+⊗⊗2x+2⊗f(x)⊗f(x)⊗f(x)\cr
x↑3+⊗x↑2+⊗2⊗f(x)⊗f(x)⊗1\cr
x↑3+⊗x↑2+⊗x+2⊗f(x)⊗1⊗f(x)\cr
x↑3+⊗x↑2+⊗2x+1⊗1⊗f(x)⊗f(x)\cr
x↑3+⊗2x↑2+⊗1⊗1⊗f(x)⊗1\cr
x↑3+⊗2x↑2+⊗x+1⊗1⊗1⊗f(x)\cr
x↑3+⊗2x↑2+⊗2x+2⊗f(x)⊗1⊗1\cr}}$$
On the other hand, when the number of irreducible polynomials of degree $d$
exceeds $2↑p$, it is clear that there will exist irreducibles that cannot
be distinguished by this method.\xskip M. O. Rabin has shown how to extend
the distinct-degree factorization technique to a complete factorization in
all cases, by doing arithmetic in a field with $p↑n$ elements (see exercise
31).
\subsectionbegin{Factoring over the integers} It
is somewhat more difficult to find the complete factorization
of polynomials with integer coefficients when we are {\sl not}
working modulo $p$, but some reasonably efficient methods are
available for this purpose.
Isaac Newton gave a method for finding linear and
quadratic factors of polynomials with integer coefficients in
his {\sl Arithmetica Universalis\/} (1707). This method was extended
by an astronomer named Friedrich von Schubert in 1793, who showed
how to find all factors of degree $n$ in a finite number of
steps; see M. Cantor, {\sl Geschichte der Mathematik \bf 4}
(Leipzig: Teubner, 1908), 136--137.\xskip L. Kronecker rediscovered
von Schubert's method independently about 90 years later; but
unfortunately the method is very inefficient when $n$ is five
or more. Much better results can be obtained with the help of
the ``mod $p$'' factorization methods presented above.
Suppose that we want to find the irreducible factors
of a given polynomial
$$u(x) = u↓nx↑n + u↓{n-1}x↑{n-1} +\cdots + u↓0,\qquad u↓n ≠ 0,$$
over the integers. As a first step, we can divide
by the greatest common divisor of the coefficients, and this
leaves us with a {\sl primitive} polynomial. We may also assume
that $u(x)$ is squarefree, by dividing out $\gcd\biglp u(x),
u↑\prime (x)\bigrp$ as above.
Now if $u(x) = v(x)w(x)$, where all of these polynomials
have integer coefficients, we obviously have $u(x) ≡ v(x)w(x)
\modulo p$ for all primes $p$, so there is a nontrivial factorization
modulo $p$ unless $p$ divides $\lscr(u)$. Berlekamp's efficient
algorithm for factoring $u(x)$ modulo $p$ can therefore be used
in an attempt to reconstruct possible factorizations of $u(x)$
over the integers.
For example, let
$$u(x) = x↑8 + x↑6 - 3x↑4 - 3x↑3 + 8x↑2 + 2x - 5.\eqno (21)$$
We have seen above in (19) that
$$u(x) ≡ (x↑4 + 2x↑3 + 3x↑2 + 4x + 6)(x↑3
+ 8x↑2 + 4x + 12)(x + 3)\modulo{13};\eqno(22)$$
and the complete factorization of $u(x)$ modulo 2 shows
one factor of degree 6 and another of degree 2 (see exercise
10). From (22) we can see that $u(x)$ has no factor of degree
2, so it must be irreducible over the integers.
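The degree argument just used is easy to mechanize. In the sketch below (our own helper, not from the text), the possible degrees of an integer factor are the subset sums of the factor degrees modulo each prime; the degree lists are those of (22) and of the factorization modulo 2.

```python
def factor_degree_sums(degs):
    """Degrees of all products of subsets of factors with the given
    degrees; a true factor over the integers must have a degree in
    this set for every prime modulus."""
    sums = {0}
    for d in degs:
        sums |= {s + d for s in sums}
    return sums

mod13 = factor_degree_sums([4, 3, 1])   # degrees of the factors in (22)
mod2 = factor_degree_sums([6, 2])       # degrees of the factors modulo 2
feasible = mod13 & mod2                  # degrees not ruled out
```

Since the intersection contains only 0 and 8, no proper factor of (21) is possible.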
This particular example was perhaps too simple;
experience shows that most irreducible polynomials can be recognized
as such by examining their factors modulo a few primes, but
it is {\sl not\/} always so easy to establish irreducibility. For example,
there are polynomials that
can be properly factored modulo $p$ for all primes
$p$, with consistent degrees of the factors, yet they are irreducible
over the integers (see exercise 12).
Almost all polynomials are irreducible over the
integers, as shown in exercise 27. But we usually aren't trying
to factor a random polynomial; there is probably some reason
to expect a nontrivial factor or else the calculation would
not have been attempted in the first place. We need a method
that identifies factors when they are there.
In general if we try to find the factors of $u(x)$
by considering its behavior modulo different primes, the results
will not be easy to combine; for example, if $u(x)$
actually is the product of four quadratic polynomials, it will
be hard to match up their images with respect to different prime
moduli. Therefore it is desirable to stick to a single prime and
to see how much mileage we can get out of it, once we feel that the
factors modulo this prime have the right degrees.
One idea is to work modulo a very {\sl large} prime
$p$, big enough so that the coefficients in any true factorization
$u(x) = v(x)w(x)$ over the integers must actually lie between
$-p/2$ and $p/2$. Then all possible integer factors can be ``read
off'' from the mod $p$ factors we know how to compute.
Exercise 20 shows how to obtain fairly good bounds
on the coefficients of polynomial factors. For example, if (21)
were reducible it would have a factor $v(x)$ of degree $≤4$, and
the coefficients of $v$ would be at most 67 in magnitude by
the results of that exercise. So all potential factors of $u(x)$
will be fairly evident if we work modulo any prime $p > 134$.
Indeed, the complete factorization modulo 137 is
$$(x↑3 + 32x↑2 + 21x + 56)(x↑5 - 32x↑4 + 45x↑3 - 2x↑2 - 51x - 27),$$
and we see immediately that $x↑3 +\cdots
+ 56$ is not a factor since 56 doesn't divide 5.\xskip $\biglp$Incidentally,
it is not trivial to obtain good bounds on the coefficients
of polynomial factors, since a lot of cancellation can occur
when polynomials are multiplied. For example, the innocuous-looking
polynomial $x↑n - 1$ has irreducible factors whose coefficients
exceed $\exp(n↑{1/\!\lg\lg n})$ for infinitely many $n$.\xskip
[See R. C. Vaughan, {\sl Michigan Math.\ J. \bf 21} (1974), 289--295].\xskip
The factorization of $x↑n-1$ is discussed in exercise 32.$\bigrp$
Instead of using a large prime $p$, which might
have to be truly enormous if $u(x)$ has large degree or large
coefficients, we can also make use of small $p$, provided that
$u(x)$ is squarefree mod $p$. For in this case, an important
construction introduced by K. Hensel [{\sl Theorie der Algebraischen
Zahlen (}Leipzig: Teubner, 1908), Chapter 4] can be used to
extend a factorization modulo $p$ in a unique way to a factorization
modulo $p↑e$ for arbitrarily high $e$. Hensel's method is described
in exercise 22; if we apply it to (22) with $p = 13$ and $e
= 2$, we obtain the unique factorization $u(x) ≡ (x - 36)(x↑3
- 18x↑2 + 82x - 66)(x↑4 + 54x↑3 - 10x↑2 + 69x + 84)\modulo{169}
= v↓1(x)v↓3(x)v↓4(x)$, say. Clearly $v↓1(x)$ and $v↓3(x)$
are not factors of $u(x)$ over the integers; and neither is
their product $v↓1(x)v↓3(x)$ when the coefficients have been
reduced modulo 169 to the range ($-{169\over 2}$, ${169\over
2}$). Thus we have exhausted all possibilities, proving once
again that $u(x)$ is irreducible over the integers---this time
using only its factorization modulo 13.
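One step of Hensel's construction (exercise 22) can be sketched in Python. This is an assumption-laden illustration, not the text's algorithm: polynomials are coefficient lists with constant term first, $u$, $v$, $w$ are monic with $v$, $w$ relatively prime mod $p$, and the cofactor $w = u(x)/(x + 3)$ mod 13 used in the demonstration was computed separately.

```python
def hensel_step(u, v, w, p):
    """Lift u = v*w (mod p) to u = V*W (mod p^2), with V = v, W = w (mod p)."""
    q = p * p
    def mul(a, b, m):
        c = [0] * max(len(a) + len(b) - 1, 0)
        for i, x in enumerate(a):
            for j, y in enumerate(b):
                c[i + j] = (c[i + j] + x * y) % m
        return c
    def sub(a, b, m):
        n = max(len(a), len(b))
        a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
        return [(x - y) % m for x, y in zip(a, b)]
    def divrem(a, b, m):
        a, inv = a[:], pow(b[-1], -1, m)
        qq = [0] * max(len(a) - len(b) + 1, 0)
        for i in reversed(range(len(qq))):
            qq[i] = a[i + len(b) - 1] * inv % m
            for j, c in enumerate(b):
                a[i + j] = (a[i + j] - qq[i] * c) % m
        while len(a) > 1 and a[-1] == 0:
            a.pop()
        return qq, a
    # extended Euclid mod p: find s, t with s*v + t*w = 1 (mod p)
    r0, r1, s0, s1, t0, t1 = v, w, [1], [0], [0], [1]
    while any(r1):
        qq, r = divrem(r0, r1, p)
        r0, r1 = r1, r
        s0, s1 = s1, sub(s0, mul(qq, s1, p), p)
        t0, t1 = t1, sub(t0, mul(qq, t1, p), p)
    inv = pow(r0[0], -1, p)                    # r0 is a nonzero constant
    s = [c * inv % p for c in s0]
    t = [c * inv % p for c in t0]
    # error term e = (u - v*w)/p, taken mod p
    diff = sub([c % q for c in u], mul(v, w, q), q)
    e = [(c // p) % p for c in diff]
    a = divrem(mul(e, t, p), v, p)[1]          # a = e*t mod v, deg a < deg v
    b = divrem(mul(e, s, p), w, p)[1]          # b = e*s mod w, deg b < deg w
    V = [(x + p * y) % q for x, y in zip(v, a + [0] * (len(v) - len(a)))]
    W = [(x + p * y) % q for x, y in zip(w, b + [0] * (len(w) - len(b)))]
    return V, W

# demo: lift the factor x + 3 of (22) from modulo 13 to modulo 169
u21 = [-5, 2, 8, -3, -3, 0, 1, 0, 1]    # u(x) of (21), constant term first
V, W = hensel_step(u21, [3, 1], [7, 7, 9, 9, 9, 10, 10, 1], 13)
```

The lifted factor `V` represents $x - 36$ modulo 169, in agreement with the factorization quoted above.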
The example we have been considering is atypical
in one important respect: We have been factoring the {\sl monic}
polynomial $u(x)$ in (21), so we could assume that all its factors
were monic. What should we do if $u↓n > 1$? In such a case,
the leading coefficients of all but one of the polynomial factors
can be varied almost arbitrarily modulo $p↑e$; we certainly
don't want to try all possibilities. Perhaps the reader has
already noticed this problem. Fortunately there is a simple way
out: the factorization $u(x) = v(x)w(x)$ implies a factorization
$u↓nu(x) = v↓1(x)w↓1(x)$ where $\lscr(v↓1) = \lscr(w↓1) = u↓n =
\lscr(u)$. (``Do you mind if I multiply your polynomial by its
leading coefficient before factoring it?'') We can proceed essentially
as above, but using $p↑e ≥ 2B$ where $B$ now bounds the maximum
coefficient for factors of $u↓nu(x)$ instead of $u(x)$.
\algstep F1. Find the unique squarefree factorization
$u(x) ≡ \lscr(u)v↓1(x) \ldotsm v↓r(x) \modulo{p↑e}$, where
the $v↓j(x)$ are monic. (This will be possible for all but a
few primes $p$; see exercise 23.) Set $d ← 1$.
\algstep F2. For every combination of factors $v(x) = v↓i|infinf
1(x) \ldotsm v↓i|infinf k(x)$ with deg$(v) = d$, and with $i↓1
= 1$ if $d = {1\over 2}$ deg$(u)$, form the unique polynomial
$\=v(x) ≡ \lscr(u)v(x) \modulo{p↑e}$ whose coefficients all lie
in the interval $[-{1\over 2}p↑e, {1\over 2}p↑e)$. If $\=v(x)$
divides $\lscr(u)u(x)$, output the factor pp\biglp $\=v(x)\bigrp
$, divide $u(x)$ by this factor, and remove the corresponding
$v↓i(x)$ from the list of factors modulo $p↑e$; terminate the
algorithm if deg$(u) < 2d$.
\algstep F3. Increase $d$ by 1, and return to F2 if $d ≤ {1\over
2}$ deg $(u)$.
|qleft |cancelindent \12skip At the conclusion of this process,
$u(x)$ will be the final irreducible factor of the originally
given polynomial. Note that if $|u↓0| < |u↓n|$, it is preferable
to work with the reverse polynomial $u↓0x↑n +\cdots
+ u↓n$, whose factors are the reverses of the factors of $u(x)$.
The above algorithm contains an obvious bottleneck:
We may have to test as many as $2↑{r-1}$ potential factors $v(x)$.
The average value of $2↑r$ in a random situation is about $n$,
or perhaps $n↑{1.5}$ (see exercise 5), but in nonrandom situations
we will want to speed up this part of the routine as much as
we can. One way to rule out spurious factors quickly is to compute
the trailing coefficient $\=v(0)$ first, continuing only if this
divides $\lscr(u)u(0)$.
Another important way to speed up the procedure
is to reduce $r$ so that it tends to reflect the true number
of factors. The distinct degree factorization algorithm above
can be applied for various small primes $p↓j$, thus obtaining
for each prime a set $D↓j$ of possible degrees of factors modulo
$p↓j$; see exercise 00. We can represent $D↓j$ as a string of
$n$ binary bits. Now we compute the intersection $∩D↓j$, namely
the logical ``and'' of these bit strings, and we perform step
F2 only for $d \in ∩D↓j$. Furthermore $p$ is selected as that
$p↓j$ having the smallest value of $r$. This technique is due
to David R. Musser, whose experience suggests trying about five
primes $p↓j$; of course we would stop immediately if the current
$∩D↓j$ shows that $u(x)$ is irreducible. Musser has given a
complete discussion of the above factorization procedure in
{\sl JACM \bf 22} (1975), 291--308.
%folio 558 galley 14 (C) Addison-Wesley 1978 -
\subsectionbegin{Greatest common divisors} Similar techniques
can be used to calculate greatest common divisors of polynomials:
If $\gcd\biglp u(x), v(x)\bigrp = d(x)$ over the integers, and
if $\gcd\biglp u(x), v(x)\bigrp = q(x)\modulo p$, where $q(x)$
is monic, then $d(x)$ is a common divisor of $u(x)$ and $v(x)$
modulo $p$; hence
$$d(x)\quad\hjust{divides}\quad q(x)\quad\modulo p.\eqno(23)$$
If $p$ does not divide the leading coefficients
of both $u$ and $v$, it does not divide the leading coefficient
of $d$; in such a case $\hjust{deg}(d) ≤\hjust{deg}(q)$. When $q(x) = 1$
for such a prime $p$, we must therefore have deg$(d) = 0$, and
$d(x) =\gcd\biglp\hjust{cont}(u), \hjust{cont}(v)\bigrp $. This justifies
the remark made in Section 4.6.1 that the simple computation
(4.6.1--6) of $\gcd\biglp u(x), v(x)\bigrp$ modulo 13 is enough
to prove that $u(x)$ and $v(x)$ are relatively prime over the
integers; the comparatively laborious calculations of Algorithm
4.6.1E or Algorithm 4.6.1C are unnecessary. Since two random
primitive polynomials are almost always relatively prime over
the integers, and since they are relatively prime modulo $p$
with probability $1 - 1/p$, it is usually a good idea to do
the computations modulo $p$.
As remarked above, we need good methods also for
the nonrandom polynomials that arise in practice. Therefore we wish to
sharpen our techniques and discover how to find $\gcd\biglp u(x),v(x)\bigrp$
in general, over the integers, based entirely on information that we obtain
working modulo primes $p$. We may assume that $u(x)$ and $v(x)$ are
primitive.
Instead of calculating $\gcd\biglp u(x), v(x)\bigrp$ directly, it will
be convenient to search instead for the polynomial
$$\=d(x) = c \cdot\gcd\biglp u(x), v(x)\bigrp ,\eqno(24)$$
where the constant $c$ is chosen so that
$$\lscr(\=d) =\gcd\biglp\lscr(u),\lscr(v)\bigrp .\eqno(25)$$
This condition will always hold for suitable $c$,
since the leading coefficient of any common divisor of $u(x)$
and $v(x)$ must be a divisor of $\gcd\biglp\lscr(u),\lscr(v)\bigrp
$. Once $\=d(x)$ has been found satisfying these conditions,
we can readily compute pp\biglp $\=d(x)\bigrp$, which is
the true greatest common divisor of $u(x)$ and $v(x)$. Condition
(25) is convenient since it avoids the uncertainty of unit multiples
of the gcd; it is essentially the idea we used to control leading
coefficients in our factorization routine.
If $p$ is a sufficiently large prime, based on
the bounds for coefficients in exercise 20 applied either to
$\lscr(\=d)u(x)$ or $\lscr(\=d)v(x)$, let us compute
the unique polynomial $\=q(x) ≡ \lscr(\=d)q(x)\modulo
p$ having all coefficients in $[-{1\over 2}p, {1\over 2}p)$.
When pp\biglp $\=q(x)\bigrp$ divides both $u(x)$ and $v(x)$,
it must equal gcd\biglp $u(x), v(x)\bigrp$ because of (23).
On the other hand if it does not divide both $u(x)$ and $v(x)$
we must have deg$(q) >$ deg$(d)$. A study of Algorithm 4.6.1E
reveals that this will be the case only if $p$ divides the leading
coefficient of one of the nonzero remainders computed by that
algorithm with exact integer arithmetic; otherwise Euclid's
algorithm modulo $p$ deals with precisely the same sequence
of polynomials as Algorithm 4.6.1E except for nonzero constant
multiples (modulo $p$). So only a small number of ``unlucky''
primes can cause us to miss the gcd, and we will soon find it
if we keep trying.
If the bound on coefficients is so large that single-precision
primes $p$ are insufficient, we can compute $\=d(x)$
modulo several primes $p$ until it has been determined via the
Chinese remainder algorithm in Section 4.3.2. This approach,
which is due to W. S. Brown and G. E. Collins, has been described
in detail by Brown in {\sl JACM \bf 18} (1971), 478--504. Alternatively,
as suggested by J. Moses and D. Y. Y. Yun [{\sl Proc.\ ACM Conf.\
\bf 28} (1973), 159--166], we can use Hensel's Lemma to determine
$\=d(x)$ modulo $p↑e$ for sufficiently large $e$. Hensel's
construction is valid directly only when
$$\gcd\biglp d(x), u(x)/d(x)\bigrp = 1\qquad\hjust{or}\qquad\gcd\biglp
d(x), v(x)/d(x)\bigrp = 1,\eqno (26)$$
since the idea is to apply the techniques of exercise
22 to one of the factorizations $\lscr(\=d)u(x) ≡ \=q(x)u↓1(x)$
or $\lscr(\=d)v(x) ≡ \=q(x)v↓1(x)\modulo
p$. In the comparatively rare cases when (26) fails, we can
still find the gcd by casting out squared factors in an appropriate
manner, as shown in exercise 29. The complete procedure has
been discussed by Miola and Yun in {\sl SIGSAM Bulletin \bf 8},
3(August 1974), 46--54; it appears to be computationally superior
to the Chinese remainder approach.
The gcd algorithms sketched here are significantly
faster than those of Section 4.6.1 except when the polynomial
remainder sequence is very short. Perhaps the best combined
approach would be to start with the computation of $\gcd\biglp
u(x), v(x)\bigrp$ modulo a fairly small prime $p$, not a divisor
of both $\lscr(u)$ and $\lscr(v)$. If the result $q(x)$ is 1, we're
done; if it has high degree, we can use Algorithm 4.6.1C\null; otherwise
we use one of the above methods, first computing a bound for
the coefficients of $\=d(x)$ based on the coefficients
of $u(x)$, $v(x)$, and the (small) degree of $q(x)$. As in the
factorization problem, we should apply this procedure to the
reverses of $u(x), v(x)$ and reverse the result, if the trailing
coefficients are simpler than the leading ones.
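The suggested first step of this combined strategy, computing gcd$\biglp u(x), v(x)\bigrp$ modulo a small prime and checking whether it is 1, can be sketched as follows. This is an illustrative sketch only, not the book's procedure; the function names and the list-of-coefficients representation (constant term first) are our assumptions.

```python
# Sketch: polynomial gcd modulo a small prime p.  If the result is 1, the
# integer polynomials u(x) and v(x) are relatively prime, and the more
# expensive gcd computation over the integers can be skipped entirely.
# Representation (an assumption): coefficient lists, constant term first,
# e.g. x^2 + 1 is [1, 0, 1].

def poly_divmod_mod(u, v, p):
    """Divide u by v over the field of p elements; return (quotient, remainder)."""
    u = u[:]
    vinv = pow(v[-1], p - 2, p)              # inverse of v's leading coefficient
    q = [0] * max(len(u) - len(v) + 1, 1)
    while len(u) >= len(v) and any(u):
        shift = len(u) - len(v)
        c = u[-1] * vinv % p
        q[shift] = c
        for i, vc in enumerate(v):           # subtract c * x^shift * v(x)
            u[shift + i] = (u[shift + i] - c * vc) % p
        while len(u) > 1 and u[-1] == 0:     # drop leading zeros
            u.pop()
    return q, u

def poly_gcd_mod(u, v, p):
    """Monic gcd of u and v modulo p, by Euclid's algorithm."""
    u = [c % p for c in u]
    v = [c % p for c in v]
    while any(v):
        _, r = poly_divmod_mod(u, v, p)
        u, v = v, r
    lead_inv = pow(u[-1], p - 2, p)          # make the result monic
    return [c * lead_inv % p for c in u]
```

For instance, `poly_gcd_mod([1, 0, 1], [1, 1], 5)` returns `[1]`, showing that $x↑2 + 1$ and $x + 1$ are relatively prime modulo 5.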
\subsectionbegin{Multivariate polynomials} Similar
techniques lead to useful algorithms for factorization or gcd
calculations on multivariate polynomials with integer coefficients.
Such a polynomial $u(x↓1, \ldotss , x↓t)$ can be dealt with modulo
the irreducible polynomials $x↓2 - a↓2, \ldotss$, $x↓t - a↓t$,
yielding the univariate polynomial $u(x↓1, a↓2, \ldotss , a↓t)$;
these irreducible polynomials play the role of $p$
in the above discussion. (Note that $v(x)$ mod $(x - a)$ is
$v(a)$.) When the integers $a↓2, \ldotss$,
$a↓t$ have been chosen so that $u(x↓1, a↓2, \ldotss , a↓t)$ has
the same degree in $x↓1$ as $u(x↓1, x↓2, \ldotss , x↓t)$, an
appropriate generalization of Hensel's Lemma will ``lift'' squarefree
factorizations of this univariate polynomial to factorizations
modulo $(x↓2 - a↓2)↑{n↓2}, \ldotss$, $(x↓t - a↓t)↑{n↓t}$,
where $n↓j$ is the degree of $x↓j$ in $u$; at the same time
we work also modulo an appropriate integer prime $p$. As many
as possible of the $a↓j$ should be zero, so that sparseness
of the intermediate results is retained. For details, see P.
S. Wang and L. P. Rothschild, {\sl Math.\ Comp.\ \bf 29} (1975),
935--950, in addition to the papers by Musser and by Moses and
Yun cited earlier.
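The remark that $v(x)$ mod $(x - a)$ is just $v(a)$ is what makes the reduction to one variable a simple substitution; a tiny sketch follows. The dictionary representation (exponent tuples mapped to coefficients) and the name `substitute_tail` are assumptions for illustration only.

```python
# Reducing u(x1, ..., xt) modulo x2 - a2, ..., xt - at amounts to
# substituting xj = aj, which leaves a univariate polynomial in x1.

def substitute_tail(u, a):
    """Substitute x2 = a[0], ..., xt = a[t-2]; return {exponent of x1: coeff}."""
    result = {}
    for exps, coef in u.items():
        for e, val in zip(exps[1:], a):      # evaluate the trailing variables
            coef *= val ** e
        result[exps[0]] = result.get(exps[0], 0) + coef
    return {e: c for e, c in result.items() if c != 0}

# u = x1^2 x2 + 3 x1 x2^2 + 5, reduced modulo x2 - 2, gives 2 x1^2 + 12 x1 + 5:
u = {(2, 1): 1, (1, 2): 3, (0, 0): 5}
```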
|qleft \24skip {\:r EXERCISES
\exno 1. [M24] Let $p$ be prime.
What is the probability that a random polynomial of degree $n$
has a linear factor (a factor of degree 1), when $n ≥ p?$ (Assume
that each of the $p↑n$ monic polynomials modulo $p$ is equally
probable.)
\exno 2. [M25] (a) Show that any monic polynomial $u(x)$, over
a unique factorization domain, may be expressed uniquely in
the form
$$u(x) = v(x)↑2w(x),$$
|qctr \6skip where $w(x)$ is squarefree $\biglp$has no factor of positive
degree of the form $d(x)↑2\bigrp$ and both $v(x)$ and $w(x)$
are monic.
|qleft |newtype \qquad b) (E. R. Berlekamp.)\xskip How many monic
polynomials of degree $n$ are squarefree modulo $p$, when $p$
is prime?
\exno 3. [M25] Let $u↓1(x), \ldotss$, $u↓r(x)$ be polynomials
over a field $S$, with $u↓j(x)$ relatively prime to $u↓k(x)$
for all $j ≠ k$. For any given polynomials $w↓1(x), \ldotss$,
$w↓r(x)$ over $S$, prove that there is a unique polynomial $v(x)$
over $S$ such that
$$\hbox{deg}(v) < \hbox{deg}(u↓1) +\cdots+ \hbox{deg}(u↓r)$$
|qctr \6skip and
$$v(x) ≡ w↓j(x)\qquad \modulo{u↓j(x)}$$
|qctr \6skip for $1 ≤ j ≤ r$. (Compare
with Theorem 4.3.2C.)
\exno 4. [HM28] Let $a↓{np}$ be the number of monic irreducible
polynomials of degree $n$, modulo a prime $p$. Find a formula
for the generating function $G↓p(z) = \sum ↓n a↓{np}z↑n$.\xskip [{\sl Hint:}
Prove the following identity connecting power series: $f(z)
= \sum ↓{n≥1} \mu (n)g(z↑n)/n$, where $g(z) = \sum ↓{n≥1}
f(z↑n)/n$.]\xskip What is $\lim ↓{p→∞}a↓{np}/p↑n$?
\exno 5. [HM30] Let $A↓{np}$ be the average number of factors
of a randomly selected polynomial of degree $n$, modulo a prime
$p$. Show that
$$\lim ↓{p→∞} A↓{np} = H↓n.$$
|qctr \6skip What is the limiting average value of $2↑r$, when
there are $r$ factors?
\exno 6. [M21] (J. L. Lagrange, 1771.)\xskip Prove the congruence
(9).\xskip [{\sl Hint:} Factor $x↑p - x$ in the field of $p$
elements.]
\exno 7. [M22] Prove Eq.\ (14).
\exno 8. [HM20] How can we be sure that the vectors output by
Algorithm N are linearly independent?
\exno 9. [20] Explain how to construct a table of reciprocals
mod 101 in a simple way, given that 2 is a primitive root of
101.
\exno 10. [21] Find the complete factorization of the polynomial
$u(x)$ in (21), modulo 2, using Berlekamp's procedure.
\exno 11. [22] Find the complete factorization of the polynomial
$u(x)$ in (21), modulo 5.
\exno 12. [M22] Use Berlekamp's algorithm to determine the number
of factors of $u(x) = x↑4 + 1$ modulo $p$, for all primes
$p$.\xskip [{\sl Hint:} Consider the cases $p = 2$, $p = 8k + 1$, $p =
8k + 3$, $p = 8k + 5$, $p = 8k + 7$ separately; what is the matrix
$Q$? You need not discover the factors; just determine how many
there are.]
\exno 13. [M25] Give an explicit formula for the factors of
$x↑4 + 1$, modulo $p$, for all primes $p$, in terms of the quantities
$\sqrt{-1}$, $\sqrt 2$, $\sqrt{-2}$ (if such quantities exist modulo
$p$).
\exno 14. [M30] Assuming that each
linear factor $x - s↓i$ of $w(x)$ has probability ${1\over 2}$
of dividing $(x - t)↑{(p-1)/2} - 1$, for each $t$, independent
of the other $s↓j$ and the behavior for other values of $t$,
estimate the running time needed to factor $w(x)$ completely
using (20).
\exno 15. [M27] Design an algorithm to calculate the ``square
root'' of a given integer $u$ modulo a given prime $p$, i.e.,
to find an integer $U$ such that $U↑2 ≡ u$ (modulo $p)$ whenever
such a $U$ exists. Your algorithm should be efficient even for
very large primes $p$. (A solution to this problem leads to
a procedure for solving any quadratic equation modulo $p$, using
the quadratic formula in the usual way.)
\exno 16. [M30] Given that $f(x)$ is an irreducible polynomial
modulo a prime $p$, of degree $n$, prove that the $p↑n$ polynomials
of degree less than $n$ form a field under arithmetic modulo
$f(x)$ and $p$. ({\sl Note:} The existence of irreducible polynomials
of each degree is proved in exercise 4; therefore fields with
$p↑n$ elements exist for all primes $p$ and all $n ≥ 1$.)
%folio 568 galley 15 (C) Addison-Wesley 1978 -
|qleft \12skip |newtype \qquad b) Show that any field with $p↑n$
elements has a ``primitive root'' element $\xi$ such that the
elements of the field are $\{0, 1, \xi , \xi ↑2, \ldotss ,
\xi ↑{p↑n-2}\}$.
|qleft \qquad c) If $f(x)$ is an irreducible polynomial modulo
$p$, of degree $n$, prove that $x↑{p↑m} - x$ is divisible
by $f(x)$ if and only if $m$ is a multiple of $n$.
\exno 17. [M23] Let $F$ be a field with $13↑2$ elements. How
many elements of $F$ have order $f$, for each integer $f$ with
$1 ≤ f < 13↑2$? (The ``order'' of an element $a$ is the least
positive integer $m$ such that $a↑m = 1$.)
\trexno 18. [M25] Let $u(x) = u↓nx↑n +\cdots + u↓0$,
$u↓n ≠ 0$, be a primitive polynomial with integer coefficients,
and let $v(x)$ be the monic polynomial defined by
$$v(x) = u↑{n-1}↓{n} \cdot u(x/u↓n) = x↑n + u↓{n-1}x↑{n-1}
+ u↓{n-2}u↓nx↑{n-2} +\cdots + u↓0u↑{n-1}↓{n}.$$\xskip
(a) Given that $v(x)$ has the complete factorization
$p↓1(x) \ldotsm p↓r(x)$ over the integers, where each $p↓j(x)$
is monic, what is the complete factorization of $u(x)$ over
the integers?\xskip (b) If $w(x) = x↑m + w↓{m-1}x↑{m-1} +\cdots
+ w↓0$ is a factor of $v(x)$, prove that $w↓k$ is a multiple
of $u↑{m-1-k}↓{n}$ for $0 ≤ k < m$.
\exno 19. [M20] ({\sl Eisenstein's criterion.})\xskip Perhaps the best-known
class of irreducible polynomials over the integers was introduced
by G. Eisenstein in {\sl J. f|spose 4ur die reine und angew.\
Math.\ \bf 39} (1850), 166--167: Let $p$ be prime and let $u(x)
= u↓nx↑n +\cdots + u↓0$ have the following properties:
(i) $u↓n$ is not divisible by $p$; (ii) $u↓{n-1}, \ldotss$, $u↓0$
are divisible by $p$; (iii) $u↓0$ is not divisible by $p↑2$.
Show that $u(x)$ is irreducible over the integers.
\exno 20. [M28] If $u(x) = u↓nx↑n +\cdots + u↓0$
is any polynomial over the complex numbers, let $|u| = (|u↓n|↑2
+\cdots + |u↓0|↑2)↑{1/2}$.\xskip (a) Let $g(x) = (x - α)u(x)$
and $h(x) = (|spose 3αx - 1)u(x)$, where $α$ is any complex
number. Prove that $|g| = |h|$.\xskip (b) Let the complete factorization
of $u(x)$ over the complex numbers be $u(x) = u↓n(x - α↓1) \ldotsm
$(x - α↓n)$, and write $M(u) = \prod ↓{1≤j≤n} \max(1, |α↓j|)$. Prove
that $M(u) ≤ |u|/|u↓n|$.\xskip (c) Show that $|u↓j| ≤ |u↓n|\biglp
({n-1\atop j})M(u) + ({n-1\atop j-1})\bigrp$
for $0 ≤ j ≤ n$.\xskip (d) Combine these results to prove that if
$u(x) = v(x)w(x)$ and $v(x) = v↓mx↑m +\cdots + v↓0$,
where $u, v, w$ all have integer coefficients, the coefficients
of $v$ are bounded by
$$|v↓j| ≤ ({m-1\atop j})|u| + ({m-1\atop j-1})|u↓n|.$$
\exno 21. [HM30] The purpose of this
exercise is to obtain useful bounds on the coefficients of {\sl
multivariate} polynomial factors over the integers. Given a
polynomial $u(x↓1, \ldotss , x↓t)$ over the complex numbers,
let $|u|$ be $(\sum |u↓{j↓1, \ldotss , j↓t}|↑2)↑{1/2}$ summed over
all the coefficients. Let $e(x) = e↑{2πix}$.\xskip (a) Prove that
$$|u|↑2 = \int ↑{1}↓{0} \cdots \int ↑{1}↓{0} |u\biglp
e(\theta ↓1), \ldotss , e(\theta ↓t)\bigrp |↑2\, d\theta
↓t \ldotsm d\theta ↓1.$$\xskip
(b) Let $u(x) = v(x)w(x)$, where deg$(v) = m$ and
deg$(w) = k$. Use the results of exercise 20 to show that $|v||w|
≤ f(m, k)↑{1/2}|u|$, where $f(m, k) = ({2m\atop m})({2k\atop
k})$.\xskip (c) Let $u(x↓1, \ldotss , x↓t) = v(x↓1, \ldotss , x↓t)w(x↓1,
\ldotss , x↓t)$, where $v$ and $w$ have the respective degrees
$m↓j$ and $k↓j$ in $x↓j$. Prove that
$$|v||w| ≤ \biglp f(m↓1, k↓1) \ldotsm f(m↓t, k↓t)\bigrp ↑{1/2}|u|.$$
\trexno 22. [M24] ({\sl Hensel's Lemma.})\xskip
Let $u(x), v↓e(x), w↓e(x), a(x), b(x)$ be polynomials with integer
coefficients, satisfying the relations
$$u(x) ≡ v↓e(x)w↓e(x)\quad \modulo{p↑e},\qquad
a(x)v↓e(x) + b(x)w↓e(x) ≡ 1\quad \modulo p,$$
|qctr \6skip where $p$ is prime, $e ≥ 1$, $v↓e(x)$ is monic, deg$(a)
<$ deg$(w↓e)$, deg$(b) <$ deg$(v↓e)$, and deg$(u) =$ deg$(v↓e)
+$ deg$(w↓e)$. Show how to compute polynomials $v↓{e+1}(x) ≡
v↓e(x)$ and $w↓{e+1}(x) ≡ w↓e(x)$ (modulo $p↑e$), satisfying
the same conditions with $e$ increased by 1. Furthermore, prove
that $v↓{e+1}(x)$ and $w↓{e+1}(x)$ are unique, modulo $p↑{e+1}$.
|qleft \qquad Use your method for $p = 2$ to prove that (21)
is irreducible over the integers, starting with its factorization
mod 2 found in exercise 10. [Note that Euclid's extended algorithm,
exercise 4.6.1--3, will get the process started for $e = 1$.]
\exno 23. [HM23] Let $u(x)$ be a primitive polynomial with integer
coefficients that is ``squarefree.''\xskip $\biglp$Cf.\ exercise 2;
the latter condition is equivalent to saying that gcd$\biglp
u(x), u↑\prime (x)\bigrp = 1.\bigrp$\xskip Prove that there are only
finitely many primes $p$ such that $u(x)$ is not squarefree
modulo $p$.
\exno 24. [M20] The text speaks only of factorization over the
integers, not over the field of rational numbers. Explain how
to find the complete factorization of a polynomial with rational
coefficients, over the field of rational numbers.
\exno 25. [M25] What is the complete factorization of $x↑5 +
x↑4 + x↑2 + x + 2$ over the field of rational numbers?
\exno 26. [20] Let $d↓1, \ldotss$, $d↓r$ be the degrees of the
irreducible factors of $u(x)$ modulo $p$, with proper multiplicity,
so that $d↓1 +\cdots + d↓r = n =$ deg$(u)$. Explain
how to compute the set $\{$deg$(v) \mid u(x) ≡
v(x)w(x)$ modulo $p$ for some $v(x), w(x)\}$ by performing $O(r)$
operations on binary bit strings of length $n$.
\exno 27. [HM30] Prove that a random primitive polynomial over
the integers is ``almost always'' irreducible, in some appropriate
sense.
\exno 28. [M18] True or false: If the complete factorization
of $u(x)$ modulo $p$ is $p↓1(x)↑{e↓1} \ldotsm p↓r(x)↑{e↓r}$
and if $u(x) ≠ 0$, then $u(x)/$gcd$\biglp u(x), u↑\prime
(x)\bigrp = p↓1(x) \ldotsm p↓r(x)$.
\trexno 29. [M21] (J. Moses and D. Y. Y. Yun.)\xskip Given an algorithm
for evaluating $d(x) =$ gcd\biglp $u(x), v(x)\bigrp$ subject
to condition (26), show how to compute the gcd in general (for
polynomials over the integers).
\exno 30. [M25] What is the probability that the distinct degree
factorization will completely factor a random polynomial of
degree $n$, modulo $p$, for fixed $n$ as $p → ∞?$
\exno 31. [M47] (V. Pratt.)\xskip If possible, find a way to construct
proofs of irreducibility.
%folio 570 galley 16 (C) Addison-Wesley 1978
|qleft \18skip {\:r 4.6.3 Evaluation of Powers
|qleft }\6skip In this section we shall study the interesting
problem of computing $x↑n$ efficiently, given $x$ and $n$, where
$n$ is a positive integer. Suppose, for example, that we are
asked to compute $x↑{16}$; we could simply start with $x$ and
multiply by $x$ fifteen times. But it is possible to obtain
the same answer with only four multiplications, if we repeatedly
take the square of each partial result, successively forming
$x↑2, x↑4, x↑8, x↑{16}$.
The same idea applies, in general, to any value
of $n$, in the following way: Write $n$ in the binary number
system (suppressing zeros at the left). Then replace each ``1''
by the pair of letters SX, replace each ``0'' by S, and cross
off the ``SX'' that now appears at the left. The result is
a rule for computing $x↑n$, if ``S'' is interpreted as the operation
of {\sl squaring}, and if ``X'' is interpreted as the operation
of {\sl multiplying by x.} For example, if $n = 23$, its binary
representation is 10111; so we form the sequence SX S SX SX
SX and remove the leading SX to obtain the rule SSXSXSX. This
rule states that we should ``square, square, multiply by $x$,
square, multiply by $x$, square, and multiply by $x$''; in
other words, we should successively compute $x↑2, x↑4, x↑5,
x↑{10}, x↑{11}, x↑{22}, x↑{23}$.
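The S-and-X rule just described can be sketched directly; the names `sx_rule` and `apply_rule` are ours, chosen for this illustration.

```python
# Sketch of the left-to-right S-and-X rule described above.

def sx_rule(n):
    """Build the rule string for x^n from the binary representation of n."""
    rule = ''.join('SX' if bit == '1' else 'S' for bit in bin(n)[2:])
    return rule[2:]                      # cross off the "SX" at the left

def apply_rule(x, rule):
    """Interpret S as squaring and X as multiplying by x."""
    y = x
    for op in rule:
        y = y * y if op == 'S' else y * x
    return y
```

For example, `sx_rule(23)` yields `'SSXSXSX'`, the rule derived in the text.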
This ``binary method'' is easily justified by a
consideration of the sequence of exponents in the calculation:
If we reinterpret ``S'' as the operation of multiplying by 2
and ``X'' as the operation of adding 1, and if we start with
1 instead of $x$, the rule will lead to a computation of $n$
because of the properties of the binary number system. The method
is quite ancient; it appeared before 200 B.C. in
Pingala's Hindu classic Chandah-s|spose
7utra [see B. Datta and A. N. Singh, {\sl History of Hindu Mathematics}
{\bf 1} (Bombay, 1935), 76]; however, there seem to be no other
references to this method outside of India during the next 2000
years!
The S-and-X binary method for obtaining $x↑n$ requires
no temporary storage except for $x$ and the current partial
result, so it is well suited for incorporation in the hardware
of a binary computer. The method can also be readily programmed
for either binary or decimal computers; but it requires that
the binary representation of $n$ be scanned from left to right,
while it is usually more convenient to do this from right to
left. For example, with a binary computer we can shift the binary
value of $n$ to the right one bit at a time until zero is reached;
with a decimal computer we can divide by 2 (or, equivalently,
multiply by 5 or ${1\over 2}$) to deduce the binary representation
from right to left. Therefore the following algorithm, based
on a right-to-left scan of the number, is often more convenient:
\algbegin Algorithm A (Right-to-left binary
method for exponentiation). This algorithm evaluates
$x↑n$, where $n$ is a positive integer. (Here $x$ belongs to
any algebraic system in which an associative multiplication,
with identity element 1, has been defined.)
|qleft \6skip |\caption Fig.\ 12. Evaluation of $x↑n$,
based on a right-to-left scan of the binary notation for $n$.
\algstep A1. [Initialize.] Set $N
← n, Y ← 1, Z ← x$.
\algstep A2. [Halve {\sl N}.] (At this point, we have the relation
$x↑n = Y \cdot Z↑N$.) Set $N ← \lfloor N/2\rfloor$, and at the
same time determine whether $N$ was even or odd. If $N$ was
even, skip to step A5.
\algstep A3. [Multiply $Y$ by {\sl Z}.] Set $Y ← Z$ times $Y$.
\algstep A4. [$N = 0$?] If $N = 0$, the algorithm
terminates, with $Y$ as the answer.
\algstep A5. [Square {\sl Z}.] Set $Z ← Z$ times $Z$, and return
to step A2.
|qleft \12skip |cancelindent \qquad As an example of Algorithm
A\null, consider the steps in the evaluation of $x↑{23}:$
$$|tab \qquad \qquad \qquad \qquad \quad |tab \qquad \quad |tab
\qquad \quad |tab \qquad \quad |tab |zfa ⊗⊗N⊗Y⊗Z\cr
\3skip After step A1⊗23⊗ \xskip 1⊗ \xskip $x$\cr
After step A4⊗11⊗ \xskip $x$⊗ \xskip $x$\cr
After step A4⊗ 5⊗ \xskip $x↑3$⊗ \xskip $x↑2$\cr
After step A4⊗ 2⊗ \xskip $x↑7$⊗ \xskip $x↑4$\cr
After step A4⊗ 0⊗ \xskip $x↑{23}$⊗ \xskip $x↑{16}$\cr$$
\6skip A \MIX\ program corresponding to Algorithm A appears in
exercise 2.
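For readers who prefer a higher-level language than MIX, here is one possible transcription of Algorithm A; the name `alg_a` is ours, and the step labels appear as comments.

```python
# One possible transcription of Algorithm A (in Python rather than MIX).

def alg_a(x, n):
    """Right-to-left binary method; x from any monoid, n a positive integer."""
    N, Y, Z = n, 1, x                     # A1. [Initialize.]
    while True:
        # A2. [Halve N.]  At this point x^n = Y * Z^N.
        N, was_odd = N // 2, N % 2 == 1
        if was_odd:
            Y = Z * Y                     # A3. [Multiply Y by Z.]
            if N == 0:                    # A4. [N = 0?]
                return Y
        Z = Z * Z                         # A5. [Square Z.]
```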
|qleft \6skip \quad The great calculator al-Kash|spose 7i stated Algorithm
A about 1414 A.D. [{\sl
Istoriko-Mat.\ Issledovani\t\ia \bf 7} (1954), 256--257]. It is
closely related to a procedure for multiplication that was
used by Egyptian mathematicians as early as 1800 B.C.
If we change step A3 to ``$Y ← Y + Z$''
and step A5 to ``$Z ← Z + Z$'', and if we set $Y$ to zero instead
of unity in step A1, the algorithm terminates with $Y = nx$.
This is a practical method for multiplication by hand, since
it involves only the simple operations of doubling, halving,
and adding. It is often called the ``Russian peasant method''
of multiplication, since Western visitors to Russia in the nineteenth
century found the method in wide use there.
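The doubling-halving-adding variant just described might be coded as follows; this is an illustrative sketch, and `peasant_multiply` is an assumed name.

```python
# The additive variant described above: with Y starting at zero, "multiply"
# replaced by "add", and "square" replaced by "double", the same control
# structure as Algorithm A computes the product n * x.

def peasant_multiply(x, n):
    N, Y, Z = n, 0, x
    while True:
        N, was_odd = N // 2, N % 2 == 1
        if was_odd:
            Y = Y + Z                     # add instead of multiply
            if N == 0:
                return Y
        Z = Z + Z                         # double instead of square
```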
The number of multiplications required by Algorithm
A is $\lfloor \lg n\rfloor + \nu (n)$, where $\nu (n)$ is the
number of ones in the binary representation of $n$. This is
one more multiplication than the left-to-right binary method
mentioned at the beginning of this section would require, due
to the fact that the first execution of step A3 is simply a
multiplication by unity.
Because of the bookkeeping time required by this
algorithm, the binary method is usually not of importance for
small values of $n$, say $n ≤ 10$, unless the time for a multiplication
is comparatively large. If the value of $n$ is known in advance,
the left-to-right binary method is preferable. In some situations,
such as the calculation of $x↑n$ mod $u(x)$ discussed in Section
4.6.2, it is much easier to multiply by $x$ than to perform
a general multiplication or to square a value, so binary methods
for exponentiation are primarily suited for quite large $n$
in such cases. If we wish to calculate the exact multiple-precision
value of $x↑n$, when $x$ is an integer >1, binary methods are
no help unless $n$ is so huge that the high-speed multiplication
routines of Section 4.3.3 are involved; and such applications
are rare. Similarly, binary methods are usually inappropriate
for raising a polynomial to a power; see R. J. Fateman, {\sl
SIAM J. Computing \bf 3} (1974), 196--213, for a discussion of
the extensive literature on polynomial exponentiation. The point
of these remarks is that binary methods are nice, but not a
panacea. They are most applicable when the time to multiply
$x↑j \cdot x↑k$ is essentially independent of $j$ and $k$ (e.g.,
for floating-point multiplication, or multiplication mod $m$);
then the running time is reduced from order $n$ to order log
$n$.
\subsectionbegin{Fewer multiplications} Several authors
have published statements (without proof) that the binary method
actually gives the {\sl minimum} possible number of multiplications.
But this is not true. The smallest counterexample is $n = 15$,
when the binary method needs 6 multiplications, yet we can calculate
$y = x↑3$ in two multiplications and $x↑{15} = y↑5$ in three
more, achieving the desired result with only 5 multiplications.
Let us now discuss some other procedures for evaluating $x↑n$,
which are appropriate in situations (e.g., within an optimizing
compiler) when $n$ is known in advance.
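Claims such as ``five multiplications suffice for $x↑{15}$'' can be checked for tiny $n$ by exhaustive search over ascending addition chains (the chain terminology is developed later in this section). This brute force is only a sketch for very small $n$; the function names are our own.

```python
# Sketch: exhaustive search over ascending addition chains, usable to check
# minimality claims like "x^15 needs only 5 multiplications" for small n.

def chain_exists(n, r):
    """Is there an ascending addition chain 1 = a_0 < ... < a_r = n?"""
    def extend(chain):
        if len(chain) - 1 == r:
            return chain[-1] == n
        if chain[-1] * 2 ** (r - (len(chain) - 1)) < n:
            return False                  # even repeated doubling cannot reach n
        sums = {a + b for a in chain for b in chain}
        return any(extend(chain + [s])
                   for s in sorted(sums, reverse=True)
                   if chain[-1] < s <= n)
    return extend([1])

def min_chain_length(n):
    """Smallest r admitting such a chain, by trying r = 0, 1, 2, ..."""
    r = 0
    while not chain_exists(n, r):
        r += 1
    return r
```

`min_chain_length(15)` returns 5, matching the five-multiplication scheme $y = x↑3$, $x↑{15} = y↑5$ described above.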
The {\sl factor method} is based on a factorization
of $n$. If $n = pq$, where $p$ is the smallest prime factor
of $n$ and $q > 1$, we may calculate $x↑n$ by first calculating
$x↑p$ and then raising this quantity to the $q$th power. If
$n$ is prime, we may calculate $x↑{n-1}$ and multiply by $x$.
And, of course, if $n = 1$, we have $x↑n$ with no calculation
at all. Repeated application of these rules gives a procedure
for evaluating $x↑n$, for any given $n$. For example, if we
want to calculate $x↑{55}$, we first evaluate $y = x↑5 = x↑4x
= (x↑2)↑2x$; then we form $y↑{11} = y↑{10}y = (y↑2)↑5y$. The
whole process takes eight multiplications, while the binary
method would have required nine. The factor method is better
than the binary method on the average, but there are cases $(n
= 33$ is the smallest example) where the binary method excels.
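The factor method just described can be sketched as a recursion; this is an illustration under our own naming, returning the multiplication count alongside the result.

```python
# Sketch of the factor method: for n = pq (p the smallest prime factor),
# compute x^p and raise it to the q-th power; for prime n, compute
# x^(n-1) and multiply by x.

def smallest_prime_factor(n):
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n                              # n itself is prime

def factor_power(x, n):
    """Return (x^n, number of multiplications used)."""
    if n == 1:
        return x, 0
    p = smallest_prime_factor(n)
    if p == n:                            # n prime: x^n = x^(n-1) * x
        v, c = factor_power(x, n - 1)
        return v * x, c + 1
    y, c1 = factor_power(x, p)            # first form y = x^p ...
    z, c2 = factor_power(y, n // p)       # ... then raise y to the q-th power
    return z, c1 + c2
```

`factor_power(2, 55)` uses eight multiplications, as stated in the text, versus nine for the binary method.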
The binary method can be generalized to an {\sl m-ary
method\/} as follows: Let $n = d↓0m↑t + d↓1m↑{t-1} +\cdots
+ d↓t$, where $0 ≤ d↓j < m$ for $0 ≤ j ≤ t$. The computation
begins by forming $x, x↑2, x↑3, \ldotss , x↑{m-1}$. (Actually,
only those powers $x↑{d↓j}$, for $d↓j$ in the representation
of $n$, are needed, and this observation often saves some of
the work.) Then raise $x↑{d↓0}$ to the $m$th power and
multiply by $x↑{d↓1}$; we have computed $y↓1 =
x↑{d↓0m+d↓1}$. Next, raise $y↓1$ to the $m$th power and
multiply by $x↑{d↓2}$, obtaining $y↓2 = x↑{d↓0m↑2+d↓1m+d↓2}$.
The process continues in this way until $y↓t = x↑n$ has been
computed. Whenever $d↓j = 0$,
%folio 574 galley 17 WARNING: Much of this tape unreadable! (C) Addison-Wesley 1978
it is, of course, unnecessary to multiply by $x↑{d↓j}$.
Note that this method reduces to the left-to-right binary method
discussed earlier, when $m = 2$; but no right-to-left $m$-ary
method will give as few multiplications when $m > 2$. If $m$
is a small prime, the $m$-ary method will be particularly efficient
for calculating powers of one polynomial.
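The $m$-ary digit schedule can be sketched as follows. This is an illustration only: Python's `**` is used for the precomputed table and the $m$th powers, so the sketch shows the order of operations rather than an optimized multiplication count, and `m_ary_power` is our own name.

```python
# Sketch of the m-ary method's digit schedule.

def m_ary_power(x, n, m):
    """Compute x^n (n >= 1) by the m-ary method."""
    digits = []
    while n:
        n, d = divmod(n, m)
        digits.append(d)
    digits.reverse()                      # d_0, d_1, ..., d_t (d_0 != 0)
    powers = {d: x ** d for d in set(digits) if d > 0}   # only needed powers
    y = powers[digits[0]]
    for d in digits[1:]:
        y = y ** m                        # raise to the m-th power
        if d:
            y = y * powers[d]             # multiply by x^d; skip when d = 0
    return y
```

With $m = 2$ this reproduces the left-to-right binary method, in agreement with the remark above.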
|qleft \12skip \quad A systematic method that gives the minimum
number of multiplications for all of the relatively small values
of $n$ (in particular, for most $n$ that occur in practical
applications) is indicated in Fig.\ 13. To calculate $x↑n$, find
$n$ in this tree, and the path from the root to $n$ indicates
the sequence of exponents that occur in an efficient evaluation
of $x↑n$. The rule for generating this ``power tree'' appears
in exercise 5. Computer tests have shown that the power tree
gives optimal results for all of the $n$ listed in the figure.
But for large enough values of $n$ the power tree method is
not always an optimal procedure; the smallest examples are $n
= 77, 154, 233$. The first case for which the power tree is
superior to both the binary method and the factor method is
$n = 23$.
|qleft \6skip |\caption Fig.\ 13. The ``power tree.''
\subsectionbegin{Addition chains} The most
economical way to compute $x↑n$ by multiplication is a mathematical
problem with an interesting history. We shall now examine it
in detail, not only because it is interesting in its own right,
but because it is an excellent example of the theoretical questions
that arise in a study of ``optimal methods of computation.''
Although we are concerned with multiplication of
powers of $x$, the problem can easily be reduced to {\sl addition},
since the exponents are additive. An {\sl addition chain\/} for $n$
is a sequence of integers
$$1 = a↓0,\ a↓1,\ a↓2,\ \ldotss ,\ a↓r = n\eqno (1)$$
with the property that
$$a↓i = a↓j + a↓k,\qquad \hbox{for some } k ≤ j < i,\eqno (2)$$
for all $i = 1$, 2, $\ldotss$, $r$. If $l(n)$ denotes the minimum
length $r$ attainable in such a chain, we have
$$l(mn) ≤ l(m) + l(n),\eqno (3)$$
|qctr \6skip since we can take the chains $1, a↓1, \ldotss$,
$a↓r = m$ and $1, b↓1, \ldotss$, $b↓s = n$ and form the chain
$1, a↓1, \ldotss , a↓r, a↓rb↓1, \ldotss$, $a↓rb↓s = mn$.
We can also recast the $m$-ary method into addition-chain
terminology: Consider the case $m = 2↑k$, and write $n = d↓0m↑t
+ d↓1m↑{t-1} +\cdots + d↓t$ in the $m$-ary number system;
the corresponding addition chain takes the form
$$\quad 1, 2, 3, \ldotss , m - 2, m - 1,$$
$$\qquad \quad 2d↓0, 4d↓0, \ldotss , md↓0, md↓0 + d↓1,$$
$$\qquad \qquad 2(md↓0 + d↓1), 4(md↓0 + d↓1), \ldotss , m(md↓0
+ d↓1), m↑2d↓0 + md↓1 + d↓2,$$
|qright \3skip \qquad \qquad \qquad \quad $\ldotss ,$\qquad \qquad
$m↑td↓0 + m↑{t-1}d↓1 +\cdots + d↓t.\qquad \qquad \eqno (4)$
|qright \6skip The length of this chain is $m - 2 + (k + 1)t$;
this number can often be reduced by deleting certain elements
of the first row that do not occur among the coefficients $d↓j$,
plus elements among $2d↓0, 4d↓0, \ldots$ that already appear
in the first row; and whenever a $d↓j$ is zero, the step at
the right end of the corresponding line may be dropped.
The simplest case of the $m$-ary method is the
binary method $(m = 2)$, when the general scheme (4) simplifies
to the ``S'' and ``X'' rule mentioned at the beginning of this
section: The binary addition chain for 2$n$ is the binary chain
for $n$ followed by $2n$; for $2n + 1$ it is the binary chain
for $2n$ followed by $2n + 1$. From the binary method we conclude
that
$$l(2↑{e↓0} + 2↑{e↓1} +\cdots + 2↑{e↓t}) ≤ e↓0 + t,\qquad
\hbox{if}\qquad e↓0 > e↓1 >\cdots
> e↓t ≥ 0.\eqno (5)$$
|qctr \6skip \qquad Let us now define auxiliary functions for
convenience in our subsequent discussion:
$$|qctr \hfill λ(n) ⊗= \lfloor \lg n\rfloor ;\eqno (6)\cr
\4skip \hfill \nu (n) ⊗= \hbox{number of 1's in the binary
representation of } n.\eqno (7)\cr$$
\6skip Thus $λ(17) = 4$, $\nu (17) = 2$; these functions may
be defined by the recurrence relations
$$|qctr \hfill λ(1) ⊗= 0,\hfill λ(2n) ⊗= λ(2n + 1) = λ(n)
+ 1;\eqno (8)\cr
\4skip \hfill \nu (1) ⊗= 1,\hfill \nu (2n) ⊗= \nu (n),\qquad
\nu (2n + 1) = \nu (n) + 1.\eqno (9)\cr$$
\6skip In terms of these functions, the binary addition chain
for $n$ requires $λ(n) + \nu (n) - 1$ steps, and (5) becomes
$$l(n) ≤ λ(n) + \nu (n) - 1.\eqno (10)$$
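The functions $λ(n)$ and $\nu(n)$ and the binary addition chain can be sketched concretely; the function names here are ours.

```python
# λ(n) and ν(n) as defined in (6)-(9), and the binary addition chain;
# its length is λ(n) + ν(n) - 1, in agreement with (10).

def lam(n):
    return n.bit_length() - 1             # λ(n) = floor(lg n)

def nu(n):
    return bin(n).count('1')              # ν(n) = number of ones

def binary_chain(n):
    """Binary addition chain: the chain for 2n is the chain for n followed
    by 2n; for 2n + 1 it is the chain for 2n followed by 2n + 1."""
    if n == 1:
        return [1]
    if n % 2 == 0:
        return binary_chain(n // 2) + [n]
    return binary_chain(n - 1) + [n]
```

For $n = 23$ this gives the chain 1, 2, 4, 5, 10, 11, 22, 23, with $λ(23) + \nu(23) - 1 = 7$ steps.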
\subsectionbegin{Special classes of chains} We may
assume without any loss of generality that an addition chain
is ``ascending,''
$$1 = a↓0 < a↓1 < a↓2 <\cdots < a↓r
= n;\eqno (11)$$
|qctr \6skip for if any two $a$'s are equal, one of them may
be dropped; and we can also rearrange the sequence (1) into
ascending order and remove terms $>n$ without destroying the
addition chain property (2). {\sl From now on we shall consider
only ascending chains}, without explicitly mentioning this assumption.
It is convenient at this point to define a few
special terms relating to addition chains. By definition we
have, for $1 ≤ i ≤ r$,
$$a↓i = a↓j + a↓k\eqno (12)$$
|qctr \6skip for some $j$ and $k, 0 ≤ k ≤ j < i$. Let us say
that step $i$ of (11) is a {\sl doubling}, if $j = k = i - 1$;
then $a↓i$ has the maximum possible value $2a↓{i-1}$ that can
follow the ascending chain $1, a↓1, \ldotss$, $a↓{i-1}$. If $j$
(but not necessarily $k)$ equals $i - 1$, let us say that step
$i$ is a {\sl star step.} The importance of star steps is explained
below. Finally let us say that step $i$ is a {\sl small step}
if $λ(a↓i) = λ(a↓{i-1})$. Since $a↓{i-1} < a↓i ≤ 2a↓{i-1}$, $λ(a↓i)$
is always equal to either $λ(a↓{i-1})$ or $λ(a↓{i-1}) + 1$;
it follows that, in any chain (11), {\sl the length r is equal
to λ(n) plus the number of small steps.}
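The three kinds of steps can be sketched as a small classifier; this is an illustration with our own naming, following the definitions just given.

```python
# Sketch: classify each step of an ascending addition chain as a doubling,
# a star step, and/or a small step, as defined above.

def classify_steps(chain):
    lam = lambda n: n.bit_length() - 1    # λ(n) = floor(lg n)
    kinds = []
    for i in range(1, len(chain)):
        doubling = chain[i] == 2 * chain[i - 1]
        star = any(chain[i] == chain[i - 1] + a for a in chain[:i])
        small = lam(chain[i]) == lam(chain[i - 1])
        kinds.append((doubling, star, small))
    return kinds
```

For the chain 1, 2, 3, 6, 12, 15 there are two small steps (at 3 and 15), and indeed $r = 5 = λ(15) + 2$.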
%folio 579 galley 18 (C) Addison-Wesley 1978
|qleft \12skip \quad Several elementary relations hold between
these types of steps: Step 1 is always a doubling. A doubling
obviously is a star step, but never a small step. A doubling
must be followed by a star step. Furthermore if step $i$ is
{\sl not} a small step, then step $i + 1$ is either a small
step or a star step, or both; putting this another way, if step
$i + 1$ is neither small nor star, step $i$ must be small.
A {\sl star chain} is an addition chain that involves
only star steps. This means that each term $a↓i$ is the sum
of $a↓{i-1}$ and a previous $a↓k$; the simple ``computer'' discussed
above after Eq.\ (2) makes use only of the two operations STA
and ADD (no LDA) in a star chain, since each new term of the
sequence utilizes the preceding result in the accumulator. Most
of the addition chains we have discussed so far are star chains.
The minimum length of a star chain for $n$ is denoted by $l↑*(n)$;
clearly
$$l(n) ≤ l↑*(n).\eqno (13)$$
|qctr \6skip \quad We are now ready to derive some nontrivial
facts about addition chains. First we can show that there must
be fairly many doublings if $r$ is not far from $λ(n):$
\thbegin Theorem A. {\sl If the addition chain $(11)$ includes
$d$ doublings and $f = r - d$ nondoublings, then}
$$n ≤ 2↑{d-1}F↓{f+3}.\eqno (14)$$
|qctr \6skip Proof. \xskip By induction on $r = d + f$, we see
that (14) is certainly true when $r = 1$. When $r > 1$, there
are three cases: If step $r$ is a doubling, then ${1\over 2}n
= a↓{r-1} ≤ 2↑{d-2}F↓{f+3}$; hence (14) follows. If steps $r$
and $r - 1$ are both nondoublings, then $a↓{r-1} ≤ 2↑{d-1}F↓{f+2}$
and $a↓{r-2} ≤ 2↑{d-1}F↓{f+1}$; hence $n = a↓r ≤ a↓{r-1} + a↓{r-2}
≤ 2↑{d-1}(F↓{f+2} + F↓{f+1}) = 2↑{d-1}F↓{f+3}$ by the definition
of the Fibonacci sequence. Finally, if step $r$ is a nondoubling
but step $r - 1$ is a doubling, then $a↓{r-2} ≤ 2↑{d-2}F↓{f+2}$
and $n = a↓r ≤ a↓{r-1} + a↓{r-2} = 3a↓{r-2}$. Now $2F↓{f+3}
- 3F↓{f+2} = F↓{f+1} - F↓f ≥ 0$; hence $n ≤ 2↑{d-1}F↓{f+3}$
in all cases.
|qleft \12skip \quad The method of proof we have used shows
that inequality (14) is ``best possible'' under the stated assumptions;
for the addition chain
$$1, 2, \ldotss , 2↑{d-1}, 2↑{d-1}F↓3, 2↑{d-1}F↓4, \ldotss ,
2↑{d-1}F↓{f+3}\eqno (15)$$
|qctr \6skip has $d$ doublings and $f$ nondoublings.
\thbegin Corollary. {\sl If the addition
chain $(11)$ includes $f$ nondoublings and $s$ small steps, then}
$$s ≤ f ≤ 3.271s.\eqno (16)$$
|qctr \6skip Proof. \xskip Obviously $s ≤ f$. We have $2↑{λ(n)}
≤ n ≤ 2↑{d-1}F↓{f+3} ≤ 2↑d\phi ↑f = 2↑{λ(n)+s}(\phi /2)↑f$,
since $d + f = λ(n) + s$, and since $F↓{f+3} ≤ 2\phi ↑f$ when
$f ≥ 0$. Hence $0 ≤ s \ln 2 + f \ln (\phi /2)$, and (16) follows
from the fact that
$$\ln 2/\ln (2/\phi ) \approx 3.2706.$$
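The numerical constant in the corollary's bound can be checked directly from the golden ratio:

```python
# ln 2 / ln(2/φ), the constant appearing in the bound (16).
import math

phi = (1 + math.sqrt(5)) / 2
ratio = math.log(2) / math.log(2 / phi)   # approximately 3.2706
```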
\subsectionbegin{Values of $l(n)$ for special $n$}
It is easy to show by induction that $a↓i ≤ 2↑i$, and therefore
$\lg n ≤ r$ in any addition chain (11). Hence
$$l(n) ≥ \lceil \lg n\rceil .\eqno (17)$$
|qctr \6skip This lower bound, together with the upper bound
(10) given by the binary method, gives us the values
$$|qctr \hfill l(2↑A) ⊗= A;\eqno (18)\cr
\4skip \hfill l(2↑A + 2↑B) ⊗= A + 1,\qquad \hbox{if}\qquad A > B.\eqno
(19)\cr$$
\6skip Thus when $\nu (n) ≤ 2$, the binary method is optimal.
With some further calculation we can extend these formulas to
the case $\nu (n) = 3:$
\thbegin Theorem B.
$$l(2↑A + 2↑B + 2↑C) = A + 2,\qquad \hbox{if}\qquad
A > B > C.\eqno (20)$$
|qctr \6skip Proof. \xskip We can, in fact, prove a stronger
result that will be of use to us later in this section:
{\sl All addition chains with exactly one small step have one
of the following six types} (where all steps indicated by ``$\ldots$''
represent doublings):
$$|indent \quad {\bf Type 1. \xskip} $1, \ldotss , 2↑A, 2↑A
+ 2↑B, \ldotss , 2↑{A+C} + 2↑{B+C}$;\xskip $A > B ≥ 0$, $C ≥ 0$.
{\bf Type 2. \xskip} $1, \ldotss , 2↑A, 2↑A + 2↑B,
2↑{A+1} + 2↑B, \ldotss , 2↑{A+C+1} + 2↑{B+C}$;\xskip $A > B ≥ 0$, $C ≥ 0$.
|qleft \2skip {\bf Type 3. \xskip} $1, \ldotss , 2↑A, 2↑A + 2↑{A-1},
2↑{A+1} + 2↑{A-1}, 2↑{A+2}, \ldotss , 2↑{A+C}$;\xskip $A > 0$, $C ≥ 2$.
|qleft \2skip {\bf Type 4. \xskip} $1, \ldotss , 2↑A, 2↑A + 2↑{A-1},
2↑{A+1} + 2↑A, 2↑{A+2}, \ldotss , 2↑{A+C}$;\xskip $A > 0$, $C ≥ 2$.
|qleft \2skip {\bf Type 5. \xskip} $1, \ldotss , 2↑A, 2↑A + 2↑{A-1},
\ldotss , 2↑{A+C} + 2↑{A+C-1}, 2↑{A+C+1} + 2↑{A+C-2}, \ldotss ,
2↑{A+C+D+1} + 2↑{A+C+D-2}$;\xskip $A > 0$, $C > 0$, $D ≥ 0$.
|qleft \2skip {\bf Type 6. \xskip} $1, \ldotss , 2↑A, 2↑A + 2↑B,
2↑{A+1}, \ldotss , 2↑{A+C}$;\xskip $A > B ≥ 0$, $C ≥ 1$.
|qleft \12skip |cancelindent \quad A straightforward hand calculation
shows that these six types exhaust all possibilities. (Note
that, by the corollary to Theorem A\null, there are at most three
nondoublings when there is one small step; this maximum of three
is attained only in sequences of type 3. All of the above are
star chains, except type 6 when $B < A - 1$.)
The theorem now follows from the observation that
$l(2↑A + 2↑B + 2↑C) ≤ A + 2$; and it must be greater than $A
+ 1$ since none of the six possible types have $\nu (n) > 2$.
(E. de Jonqui|spose 2eres had stated without proof
in 1894 that $l(n) ≥ λ(n) + 2$ when $\nu (n) > 2$; the first
published demonstration of Theorem B was by A. A. Gioia, M.
V. Subbarao, and M. Sugunamma in {\sl Duke Math.\ J. \bf 29}
(1962), 481--487.)
The calculation of $l(2↑A + 2↑B + 2↑C + 2↑D)$,
when $A > B > C > D$, is more involved; by the binary method
it is at most $A + 3$, and by the proof of Theorem B it is at
least $A + 2$. The value $A + 2$ is possible, since we know
that the binary method is not optimal when $n = 15$ or $n =
23$. The complete behavior when $\nu (n) = 4$ can be determined,
as we shall now see.
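Small cases such as these are easily checked by machine. The following brute-force search (a Python sketch, not part of the text; the function name is invented here) computes $l(n)$ by iterative deepening over all increasing addition chains:

```python
def shortest_chain_length(n):
    """Compute l(n), the minimum number of steps r in an addition
    chain 1 = a_0 < a_1 < ... < a_r = n, by iterative deepening."""
    def search(chain, steps_left):
        top = chain[-1]
        if top == n:
            return True
        if top << steps_left < n:   # even doubling at every remaining step falls short
            return False
        # Candidate next elements: sums of two chain members, largest first.
        cands = sorted({a + b for a in chain for b in chain
                        if top < a + b <= n}, reverse=True)
        return any(search(chain + [c], steps_left - 1) for c in cands)

    r = 0
    while not search([1], r):
        r += 1
    return r
```

For instance, the search confirms $l(15) = 5$ and $l(23) = 6$, beating the binary method's $λ(n) + \nu (n) - 1$ steps; $l(39) = 7$ and $l(135) = 9$ can be verified in the same way, although the latter takes noticeably longer.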
\thbegin Theorem C. {\sl If $\nu (n) ≥ 4$ then
$l(n) ≥ λ(n) + 3$, except in the following circumstances when
$A > B > C > D$ and $l(2↑A + 2↑B + 2↑C + 2↑D)$ equals $A + 2$:}
|qleft \quad Case {\sl 1}. \xskip $A - B = C - D$.\xskip (Example: $n = 15$.)
|qleft \2skip \quad Case {\sl 2}. \xskip $A - B = C - D + 1$.\xskip (Example: $n = 23$.)
|qleft \2skip \quad Case {\sl 3}. \xskip $A - B = 3$, $C - D = 1$.\xskip (Example: $n = 39$.)
|qleft \2skip \quad Case {\sl 4}. \xskip $A - B = 5$, $B - C = C - D = 1$.\xskip (Example: $n = 135$.)
|qleft \6skip Proof. \xskip When $l(n) = λ(n) + 2$, there is
an addition chain for $n$ having just two small steps; such
an addition chain starts out as one of the six types in the
proof of Theorem B\null, followed by a small step, followed by a
sequence of nonsmall steps. Let us say that $n$ is ``special''
if $n = 2↑A + 2↑B + 2↑C + 2↑D$ for one of the four cases listed
in the theorem. We can obtain addition chains of the required
form for each special $n$, as shown in exercise 13; therefore
it remains for us to prove that no chain with exactly two small
steps contains any elements with $\nu (a↓i) ≥ 4$ except when
$a↓i$ is special.
Let a ``counterexample chain'' be an addition chain
with two small steps such that $\nu (a↓r) ≥ 4$, but $a↓r$ is
not special. If counterexample chains exist, let $1 = a↓0 <
a↓1 <\cdots < a↓r = n$ be a counterexample chain
of shortest possible length. Then step $r$ is not a small step,
since none of the six types in the proof of Theorem B can be
followed by a small step with $\nu (n) ≥ 4$ except when $n$
is special. Furthermore, step $r$ is not a doubling, for otherwise
$a↓0, \ldotss, a↓{r-1}$ would be a shorter counterexample
chain; and step $r$ is a star step, for otherwise $a↓0, \ldotss,
a↓{r-2}, a↓r$ would be a shorter counterexample chain. Thus
$$a↓r = a↓{r-1} + a↓{r-k},\qquad \hbox{and}\qquad λ(a↓r)
= λ(a↓{r-1}) + 1.\eqno (21)$$
%folio 584 galley 19 (C) Addison-Wesley 1978
|qleft \12skip \quad Let $c$ be the number of carries that
occur when $a↓{r-1}$ is added to $a↓{r-k}$ in the binary number
system by Algorithm 4.3.1A\null. Then we have the fundamental relation
$$\nu (a↓r) = \nu (a↓{r-1}) + \nu (a↓{r-k}) - c.\eqno (22)$$
We can now prove that {\sl step $r - 1$ is not a
small step} (see exercise 14).
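Relation (22) can be checked mechanically; here is a small Python sketch (my own, not from the text) that simulates the carry count of Algorithm 4.3.1A bit by bit and verifies (22) exhaustively for small operands:

```python
def nu(n):
    """nu(n) = number of 1 bits in the binary representation of n."""
    return bin(n).count('1')

def carries(a, b):
    """Number of carries when a is added to b in the binary number
    system, simulated one bit position at a time."""
    count = carry = 0
    while a or b or carry:
        s = (a & 1) + (b & 1) + carry
        carry = s >> 1
        count += carry          # a carry out of this bit position
        a >>= 1
        b >>= 1
    return count

# The fundamental relation (22): nu(a + b) = nu(a) + nu(b) - c.
for a in range(1, 300):
    for b in range(1, a + 1):
        assert nu(a + b) == nu(a) + nu(b) - carries(a, b)
```

Each carry merges two 1 bits into one, which is exactly why (22) holds.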
Let $m = λ(a↓{r-1})$. Since neither $r$ nor $r
- 1$ is a small step, $c ≥ 2$; and $c = 2$ can hold only when
$a↓{r-1} ≥ 2↑m + 2↑{m-1}$.
Now let us suppose that $r - 1$ is not a star step;
then $r - 2$ is a small step, and $a↓0, \ldotss, a↓{r-3}, a↓{r-1}$
is a chain with only one small step; hence $\nu (a↓{r-1}) ≤
2$ and $\nu (a↓{r-2}) ≤ 4$. The relation (22) can now hold only
if $\nu (a↓r) = 4, \nu (a↓{r-1}) = 2, k = 2, c = 2, \nu (a↓{r-2})
= 4$. From $c = 2$ we conclude that $a↓{r-1} = 2↑m + 2↑{m-1}$;
hence $a↓0, a↓1, \ldotss, a↓{r-3} = 2↑{m-1} + 2↑{m-2}$ is an
addition chain with only one small step, and it must be of Type
1 so $a↓r$ belongs to Case 3. Thus $r - 1$ is a star step.
Now assume that $a↓{r-1} = 2↑ta↓{r-k}$ for some
$t$. If $\nu (a↓{r-1}) ≤ 3$, then by (22), $c = 2, k = 2$, and
we see that $a↓r$ must belong to Case 3. If $\nu (a↓{r-1}) =
4$ then $a↓{r-1}$ is special, and it is easy to see by considering
each case that $a↓r$ also belongs to one of the four cases.
(Case 4 arises, for example, when $a↓{r-1} = 90$, $a↓{r-k} = 45$;
or $a↓{r-1} = 120$, $a↓{r-k} = 15$.) Therefore we may conclude
that $a↓{r-1} ≠ 2↑ta↓{r-k}$ for any $t$.
Let us now suppose that $λ(a↓{r-k}) = m -1$; the
case $λ(a↓{r-k}) < m - 1$ may be ruled out by similar arguments,
as shown in exercise 14. If $k = 4$, both $r - 2$ and $r - 3$
are small steps; hence $a↓{r-4} = 2↑{m-1}$, and (22) is impossible.
Therefore $k = 3$; step $r - 2$ is small, $\nu (a↓{r-3}) = 2,
c = 2, a↓{r-1} ≥ 2↑m + 2↑{m-1}$, and $\nu (a↓{r-1}) = 4$. There
must be at least two carries when $a↓{r-2}$ is added to $a↓{r-1}
- a↓{r-2}$; hence $\nu (a↓{r-2}) = 4$, and $a↓{r-2}$ (being
special and at least ${1\over 2}a↓{r-1}$) has the form $2↑{m-1} + 2↑{m-2}
+ 2↑{d+1} + 2↑d$ for some $d$. Now $a↓{r-1}$ is either
$2↑m + 2↑{m-1} + 2↑{d+1} + 2↑d$ or $2↑m + 2↑{m-1} + 2↑{d+2}
+ 2↑{d+1}$, and in both cases $a↓{r-3}$ must be $2↑{m-1} + 2↑{m-2}$,
so $a↓r$ belongs to Case 3.
E. G. Thurber [{\sl Pacific J. Math.\ \bf 49} (1973),
229--242] has extended Theorem C to show that $l(n) ≥ λ(n) +
4$ when $\nu (n) > 8$, and it seems reasonable to conjecture
that $l(n) ≥ λ(n) +$ lg $\nu (n)$ in general, since A. Sch\"onhage
has come very close to proving this (see exercise 29).
\subsectionbegin{Asymptotic values} Theorem C
indicates that it is probably quite difficult to get exact values
of $l(n)$ for large $n$, when $\nu (n) > 4$; however, we can
determine the approximate behavior in the limit as $n → ∞$.
\algbegin Theorem D (\rm A. Brauer, {\sl Bull.\ Amer.\ Math.\ Soc.\
\bf 45} (1939), 736--739).
$$\lim ↓{n→∞} l↑*(n)/λ(n) = \lim ↓{n→∞} l(n)/λ(n) = 1.\eqno (23)$$
Proof. \xskip The addition chain (4) for the $2↑k$-ary
method is a star chain if we delete the second occurrence of
any element that appears twice in the chain; for if $a↓i$ is
the first element among $2d↓0, 4d↓0, \ldots$ of the second line
that is not present in the first line, we have $a↓i ≤ 2(m -
1)$; hence $a↓i = (m - 1) + a↓j$ for some $a↓j$ in the first
line. By totaling up the length of the chain, we therefore have
$$λ(n) ≤ l(n) ≤ l↑*(n) < \left(1 + {1\over k}\right)\lg n + 2↑k\eqno (24)$$
for all $k ≥ 1$. The theorem follows if we choose,
say, $k = \lfloor {1\over 2}\lg λ(n)\rfloor $.
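The construction in this proof is easy to program. Here is a Python sketch (my own; the $2↑k$-ary details follow the description above, with second occurrences deleted) together with a numerical check of the bound (24):

```python
from math import log2

def chain_2k(n, k):
    """Addition chain for n by the 2^k-ary method: first the line
    1, 2, ..., 2^k - 1, then k doublings per base-2^k digit of n,
    adding the digit when it is nonzero.  Duplicates are removed,
    keeping first occurrences, as in the proof of Theorem D."""
    m = 1 << k
    digits, t = [], n
    while t:
        digits.append(t % m)
        t //= m
    digits.reverse()
    raw = list(range(1, m)) if m > 2 else [1]
    x = digits[0]
    for d in digits[1:]:
        for _ in range(k):
            x *= 2
            raw.append(x)
        if d:
            x += d
            raw.append(x)
    seen, chain = set(), []
    for a in raw:
        if a not in seen:
            seen.add(a)
            chain.append(a)
    return chain

def is_addition_chain(chain):
    """Each element after 1 must be a sum of two earlier elements."""
    return chain[0] == 1 and all(
        any(chain[p] + chain[q] == chain[i]
            for p in range(i) for q in range(p + 1))
        for i in range(1, len(chain)))

for n in (1000, 65535, 10**6):
    for k in (1, 2, 3, 4):
        c = chain_2k(n, k)
        assert is_addition_chain(c) and c[-1] == n
        # the step count obeys the bound in (24):
        assert len(c) - 1 < (1 + 1/k) * log2(n) + (1 << k)
```

With $k = 1$ this reduces to the ordinary binary method.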
\quad A sharper choice of $k$ in (24) yields the upper bound
$$l(n) ≤ λ(n) + λ(n)/λλ(n) + O\biglp λ(n)λλλ(n)/λλ(n)↑2\bigrp ;\eqno (25)$$
the second term $λ(n)/λλ(n)$ is essentially the best that
can be obtained from (24). A much deeper analysis of lower bounds
can be carried out, to show that this term $λ(n)/λλ(n)$ is, in fact,
essential in (25). In order to see why this is so, let us consider
the following fact:
\algbegin Theorem E (\rm Paul Erd\H os, {\sl Acta Arithmetica
\bf 6} (1960), 77--81). {\sl Let $ε$ be a positive real number. The
number of addition chains (11) such that
$$λ(n) = m,\qquad r ≤ m + (1 - ε)m/λ(m)\eqno (26)$$
is less than $α↑m$, for some $α < 2$, for all suitably
large $m$}.\xskip (In other words, the number of addition chains
so short that (26) is satisfied is substantially less than
the number of values of $n$ such that $λ(n) = m$, when $m$ is
large.)
|qleft \12skip {\sl Proof. \xskip} We want to estimate the number
of possible addition chains, and for this purpose our first
goal is to get an improvement of Theorem A that enables us
to deal more satisfactorily with nondoublings:
\thbegin Lemma A. {\sl Let $\delta < \sqrt 2 - 1$ be a
fixed positive real number. Call step $i$ of an addition chain
a ``ministep'' if $a↓i < a↓j(1 + \delta )↑{i-j}$ for some $j$,
$0 ≤ j < i$. If the addition chain contains $s$ small steps and
$t$ ministeps, then}
$$t ≤ s/(1 - \theta ),\qquad \hbox{where}\qquad (1 + \delta )↑2 =
2↑\theta .\eqno (27)$$
Proof. \xskip For each ministep $i↓k$, $1 ≤ k ≤ t$,
we have $a↓{i↓k} < a↓{j↓k}(1 + \delta )↑{i↓k-j↓k}$ for some
$j↓k < i↓k$. Let $I↓1, \ldotss, I↓t$ be the intervals
$(j↓1, i↓1], \ldotss, (j↓t, i↓t]$, where the notation $(j, i]$
stands for the set of all integers $k$ such that $j < k ≤ i$.
It is possible (see exercise 17) to find nonoverlapping intervals
$J↓1, \ldotss, J↓h = (j↑\prime↓1, i↑\prime↓1], \ldotss,
(j↑\prime↓h, i↑\prime↓h]$ such that
$$I↓1 ∪ \cdots ∪ I↓t = J↓1 ∪ \cdots ∪ J↓h,\qquad
a↓{i↑\prime↓k} < a↓{j↑\prime↓k}(1 + \delta )↑{2(i↑\prime↓k-j↑\prime↓k)},\qquad
\hbox{for}\quad 1 ≤ k ≤ h.\eqno (28)$$
|qright \6skip Now for all steps $i$ outside of the intervals
$J↓1, \ldotss$, $J↓h$ we have $a↓i ≤ 2a↓{i-1}$; hence if
$$q = (i↑\prime↓1 - j↑\prime↓1) + \cdots + (i↑\prime↓h - j↑\prime↓h)$$
we have
$$2↑{λ(n)} ≤ n ≤ 2↑{r-q}(1 + \delta )↑{2q} = 2↑{λ(n)+s-(1-\theta )q}
≤ 2↑{λ(n)+s-(1-\theta )t}.$$
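Lemma A can also be tested numerically. The following Python sketch (the function names are my own) counts small steps and ministeps of binary-method chains and checks inequality (27) with $\delta = 0.2$, a value less than $\sqrt 2 - 1$:

```python
from math import log2

def binary_chain(n):
    """Addition chain for n by the left-to-right binary method."""
    chain = [1]
    for bit in bin(n)[3:]:          # bits of n after the leading 1
        chain.append(chain[-1] * 2)
        if bit == '1':
            chain.append(chain[-1] + 1)
    return chain

def small_steps(chain):
    """Count steps i with lambda(a_i) = lambda(a_{i-1})."""
    return sum(chain[i].bit_length() == chain[i - 1].bit_length()
               for i in range(1, len(chain)))

def ministeps(chain, delta):
    """Count steps i with a_i < a_j * (1 + delta)^(i - j), some j < i."""
    return sum(any(chain[i] < chain[j] * (1 + delta) ** (i - j)
                   for j in range(i))
               for i in range(1, len(chain)))

delta = 0.2                          # any fixed delta < sqrt(2) - 1
theta = log2((1 + delta) ** 2)       # so that (1 + delta)^2 = 2^theta
for n in range(2, 1000):
    c = binary_chain(n)
    assert ministeps(c, delta) <= small_steps(c) / (1 - theta)   # (27)
```

The lemma of course applies to every addition chain; the binary-method chains merely furnish a convenient family for the check.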
\quad Returning to the proof of Theorem E\null, let
us choose $\delta = 2↑{ε/4} - 1$, and let us divide the $r$
steps of each addition chain into three classes:
$$t \hbox{ ministeps},\qquad u \hbox{ doublings},\qquad \hbox{and}\qquad
v \hbox{ other steps};\qquad t + u + v = r.\eqno (29)$$
Counting another way, we have $s$ small steps,
where $s + m = r$. By the hypotheses, Theorem A\null, and Lemma A\null,
we obtain the relations
$$t ≤ s/(1 - ε/2),\qquad t + v ≤ 3.271s,\qquad s ≤ (1 - ε)m/λ(m).\eqno (30)$$
\quad Given $s$, $t$, $u$, $v$ satisfying these conditions,
there are at most
$$\left({r\atop t + v}\right)\left({t + v\atop v}\right)\eqno (31)$$
ways to decide which steps belong to which class. Given such a
distribution of the steps, let us consider how the non-ministeps
can be selected: If step $i$ is one of the ``other'' steps in
(29), $a↓i ≥ (1 + \delta )a↓{i-1}$, so $a↓i = a↓j + a↓k$, where
$\delta a↓{i-1} ≤ a↓k ≤ a↓j ≤ a↓{i-1}$. Also $a↓j ≤ a↓i/(1 +
\delta )↑{i-j} ≤ 2a↓{i-1}/(1 + \delta )↑{i-j}$, so $\delta ≤
2/(1 + \delta )↑{i-j}$. This gives at most $β$ choices for $j$,
where $β$ is a constant that depends only on $\delta $. There
are also at most $β$ choices for $k$, so the number of ways
to assign $j$ and $k$ for each of the non-ministeps is at most
$$β↑{2v}.\eqno (32)$$
%folio 587 galley 20 (C) Addison-Wesley 1978
\quad Finally, once the ``$j$'' and ``$k$'' have
been selected for each of the non-ministeps, there are fewer
than
$$\left({r↑2\atop t}\right)\eqno (33)$$
|qctr \6skip ways to choose the $j$ and the $k$ for the ministeps:
We select $t$ distinct pairs $(j↓1, k↓1), \ldotss, (j↓t, k↓t)$
of indices in the range $0 ≤ k↓h ≤ j↓h < r$, in fewer than (33)
ways. Then for each ministep $i$, in turn, we use a pair of
indices $(j↓h, k↓h)$ such that
\quad a) $j↓h < i$;
\quad b) $a↓{j↓h} + a↓{k↓h}$ is as small as
possible among the pairs not already used for smaller ministeps $i$;
\quad c) $a↓i = a↓{j↓h} + a↓{k↓h}$ satisfies
the definition of ministep.
|qleft \6skip If no such pair $(j↓h, k↓h)$ exists, we get no
addition chain; on the other hand, any addition chain with ministeps
in the designated places must be selected in one of these ways,
so (33) is an upper bound on the possibilities.
Thus the total number of possible addition chains
satisfying (26) is bounded by (31) times (32) times (33), summed
over all relevant $s, t, u$, and $v$. The proof of Theorem E
can now be completed by means of a rather standard estimation
of these functions (exercise 18).
\thbegin Corollary. {\sl $l(n)$ is asymptotically
$λ(n) + λ(n)/λλ(n)$, for almost all $n$. More precisely, there is
a function $f(n)$ such that $f(n) → 0$ as $n → ∞$, and}
$$\Pr \biglp |l(n) - λ(n) - λ(n)/λλ(n)| ≥ f(n)λ(n)/λλ(n)\bigrp
= 0.\eqno (34)$$
|qctr \6skip (See Section 3.5 for this definition of the probability
``Pr''.)
Proof. \xskip The upper bound (25) shows
that (34) holds without the absolute value signs. The lower
bound comes from Theorem E\null, if we let $f(n)$ decrease to zero
slowly enough so that, when $f(n) ≤ ε$, the value $N$ is
so large that at most $εN$ values $n ≤ N$ have $l(n) ≤
λ(n) + (1 - ε)λ(n)/λλ(n)$.
\subsectionbegin{Star chains} Optimistic people find
it reasonable to suppose that $l(n) = l↑*(n)$; given an addition
chain of minimal length $l(n)$, it appears hard to believe that
we cannot find one of the same length that satisfies the (apparently
mild) star condition. But in 1958 Walter Hansen proved the remarkable
theorem that, for certain large values of $n$, $l(n)$ is definitely
less than $l↑*(n)$, and he also proved several related theorems
that we will now investigate.
Hansen's theorems begin with an investigation of the detailed
structure of a star chain. This structure is given in terms
of other sequences and sets constructed from the given chain.
Let $n = 2↑{e↓0} + 2↑{e↓1} +\cdots + 2↑{e↓t}$, $e↓0 > e↓1
>\cdots > e↓t ≥ 0$, and let $1 = a↓0 < a↓1 <\cdots < a↓r = n$
be a star chain for $n$. If there are $d$ doublings
in this chain, we define the auxiliary sequence
$$0 = d↓0 ≤ d↓1 ≤ d↓2 ≤\cdots ≤ d↓r = d,\eqno (35)$$
|qctr \6skip where $d↓i$ is the number of doublings among steps
$1, 2, \ldotss$, $i$. We also define a sequence of ``multisets''
$S↓0, S↓1, \ldotss$, $S↓r$, which keep track of the powers of
2 present in the chain. (A {\sl multiset} is a mathematical
entity that is like a set but it is allowed to contain repeated
elements; an object may be an element of a multiset several
times, and its multiplicity of occurrences is relevant. See
exercise 19 for familiar examples of multisets.) $S↓i$ is defined
by the rules
\quad a) $S↓0 = \{0\}$;
\quad b) If $a↓{i+1} = 2a↓i$, then $S↓{i+1} = \{x \mid x
- 1 \in S↓i\}$;
\quad c) If $a↓{i+1} = a↓i + a↓k$, $k < i$, then $S↓{i+1}
= S↓i ∪ S↓k$.
|qleft \6skip (The symbol ∪ means that the multisets are combined,
adding the multiplicities.) From this definition it follows
that
$$a↓i = \sum ↓{x\in S↓i} 2↑x,\eqno (36)$$
|qctr \6skip where the terms in this sum are not necessarily
distinct, and in particular
$$n = 2↑{e↓0} + 2↑{e↓1} +\cdots + 2↑{e↓t} = \sum ↓{x\in S↓r} 2↑x.\eqno (37)$$
The number of elements in the latter sum is at
most $2↑f$, where $f = r - d$ is the number of nondoublings.
Since $n$ has two different binary representations
in (37), we can partition the multiset $S↓r$ into multisets
$M↓0, M↓1, \ldotss$, $M↓t$ such that
$$2↑{e↓j} = \sum ↓{x\in M↓j} 2↑x,\qquad 0 ≤ j ≤ t.\eqno (38)$$
This can be done by arranging the elements of $S↓r$
into nondecreasing order $x↓1 ≤ x↓2 ≤\cdots$, taking $M↓t =
\{x↓1, x↓2, \ldotss, x↓k\}$ where $2↑{x↓1} +\cdots
+ 2↑{x↓k} = 2↑{e↓t}$. This must be possible
since $e↓t$ is the smallest of the $e$'s. Similarly, $M↓{t-1}
= \{x↓{k+1}, x↓{k+2}, \ldotss, x↓{k↑\prime}\}$, and so on; the process
is easily visualized in binary notation.
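Rules (a), (b), (c) are easy to simulate by machine; this Python sketch (my own, with a Counter playing the role of a multiset) builds $S↓0, \ldotss, S↓r$ for a star chain for 15 and checks Eqs.\ (36) and (37):

```python
from collections import Counter

def multisets_S(chain):
    """Apply rules (a), (b), (c) to a star chain:
    S_0 = {0}; a doubling shifts every element up by 1; and a
    star step a_{i+1} = a_i + a_k merges S_i with S_k, adding
    the multiplicities."""
    S = [Counter({0: 1})]
    for i in range(1, len(chain)):
        if chain[i] == 2 * chain[i - 1]:
            S.append(Counter({x + 1: m for x, m in S[-1].items()}))
        else:
            k = chain.index(chain[i] - chain[i - 1])   # a_i = a_{i-1} + a_k
            S.append(S[-1] + S[k])
    return S

chain = [1, 2, 3, 6, 12, 15]         # a star chain for n = 15
S = multisets_S(chain)
for a_i, S_i in zip(chain, S):       # Eq. (36): a_i = sum of 2^x over S_i
    assert a_i == sum(m << x for x, m in S_i.items())

d = sum(chain[i] == 2 * chain[i - 1] for i in range(1, len(chain)))
f = len(chain) - 1 - d               # number of nondoublings
assert sum(S[-1].values()) <= 2 ** f # the sum in (37) has at most 2^f terms
```

Here $S↓r = \{3, 2, 1, 0\}$, matching $15 = 2↑3 + 2↑2 + 2↑1 + 2↑0$.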
Let $M↓j$ contain $m↓j$ elements (counting multiplicities);
then $m↓j ≤ 2↑f - t$, since $S↓r$ has at most $2↑f$ elements
and it has been partitioned into $(t + 1)$ nonempty multisets.
By Eq.\ (38), we can see \ldots
$$-m↓v < (e↓j - e↓v) - (d↓u - d↓i) < m↓j.\eqno (42)$$
\quad Although Lemma K may not look extremely
powerful, it says (when $M↓{ij}$ contains an element in common
with $M↓{uv}$ and when $m↓j$, $m↓v$ are reasonably small) that
the number of doublings between steps $i$ and $u$ is approximately
$e↓j - e↓v$. We would like to be able to prove
a result analogous to Theorem B above, that $l↑*(n) = e↓0 +
t$, provided that the $e↓j$ are far enough apart. The next theorem
shows how this can be done.
\algbegin Theorem H (\rm W. Hansen, {\sl J. f\"ur die reine und
angew.\ Math.\ \bf 202} (1959), 129--136). {\sl Let $n = 2↑{e↓0}
+ 2↑{e↓1} +\cdots + 2↑{e↓t}$, $e↓0 > e↓1 >\cdots > e↓t ≥ 0$. If
$$e↓0 > 2e↓1 + 3.271(t - 1)\qquad \hbox{and}\qquad e↓{i-1}
≥ e↓i + 2m\qquad \hbox{for}\quad 1 ≤ i ≤ t,\eqno (43)$$
where $m = 2↑{\lfloor 3.271(t-1)\rfloor } - t$,
then $l↑*(n) = e↓0 + t$.}